Abstract

In this paper we study the component structure of random graphs with 
independence between the edges. Under mild assumptions, we determine 
whether there is a giant component, and find its asymptotic size when 
it exists. We assume that the sequence of matrices of edge probabilities 
converges to an appropriate limit object (a kernel), but only in a very 
weak sense, namely in the cut metric. Our results thus generalize previous 
results on the phase transition in the already very general inhomogeneous 
random graph model introduced by the present authors in [4], as well as 
related results of Bollobás, Borgs, Chayes and Riordan [3], all of which
involve considerably stronger assumptions. We also prove corresponding 
results for random hypergraphs; these generalize our results on the phase 
transition in inhomogeneous random graphs with clustering [5]. 

1 Introduction and results 

Throughout this paper we consider random graphs with independence between 
the edges. The distribution of a random $n$-vertex graph with this property is
of course specified by the matrix of edge probabilities; here we are interested in
the asymptotic behaviour of the component structure as $n \to \infty$, so we shall
consider a sequence of such matrices. Our main focus is to determine when there
is whp a giant component, i.e., a component containing $\Theta(n)$ vertices. Here, as
usual, an event holds with high probability, or whp, if it holds with probability
$1 - o(1)$ as $n \to \infty$. When there is a giant component, we shall also find its
asymptotic size. 

For these questions it is natural to focus on (extremely) sparse graphs, with 
$O(n)$ edges, so we shall normalize by considering matrices $A_n$ whose entries are
$n$ times the corresponding edge probabilities. Thus the case in which each $A_n$
has all (off-diagonal) entries equal to some $c > 0$ corresponds to the classical
sparse model $G(n, c/n)$. Without some further assumptions, it seems difficult to
prove asymptotic results, although Alon [T] did so for some questions concerning
connectedness. As in previous work, the natural additional assumption turns
out to be convergence to a suitable limiting object, namely a kernel, i.e., a
symmetric non-negative function on $[0, 1]^2$. Our aim is to relate the asymptotic
size of the giant component to a suitable function of this kernel. 

The aim described above was also one of the aims of [1], and of Bollobás,
Borgs, Chayes and Riordan [5]. We shall prove a common generalization of 
the corresponding results from these papers by weakening the assumptions: we 
shall work with convergence in the cut metric (defined below) as in [3], while 
allowing unbounded matrices and kernels, as in [4]. It turns out that these very 
weak, natural assumptions suffice to allow us to relate the giant component of 
the random graph to the kernel. 

To state our results we shall need a few definitions. By a kernel on $[0, 1]$ we
simply mean an integrable, symmetric function $\kappa : [0, 1]^2 \to [0,\infty)$. We regard
kernels as elements of $L^1$, so two kernels that are equal almost everywhere are
considered to be the same. 

Throughout, $A_n$ will denote a symmetric $n$-by-$n$ matrix with non-negative
entries. If $A_n = (a_{ij})$ is such a matrix, then there is a piecewise constant
kernel $\kappa_{A_n}$ naturally associated to $A_n$: this takes the value $a_{ij}$ on the square
$((i-1)/n, i/n] \times ((j-1)/n, j/n]$. We call $\kappa$ an $n$-by-$n$ kernel if it is of the form
$\kappa_{A_n}$ for some $A_n$.
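As an illustration of this correspondence, the following Python sketch evaluates the step kernel of a matrix at a point of $(0, 1]^2$. It is not part of the paper: the name `step_kernel` and the nested-list representation of $A_n$ are assumptions made for the example, and points on the boundaries between squares (a null set) are assigned by the half-open convention above only up to floating-point rounding.

```python
import math

def step_kernel(A):
    """Return kappa_A, the piecewise constant kernel of a square matrix A."""
    n = len(A)

    def kappa(x, y):
        # x lies in ((i-1)/n, i/n] exactly when i = ceil(n*x); likewise for y and j.
        i = max(1, math.ceil(n * x))
        j = max(1, math.ceil(n * y))
        return A[i - 1][j - 1]

    return kappa

A = [[0.0, 2.0], [2.0, 1.0]]   # a symmetric 2-by-2 example
kappa = step_kernel(A)
```

Since $A_n$ is symmetric, the resulting function satisfies `kappa(x, y) == kappa(y, x)`.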

There is a (sparse) random graph naturally associated to $A_n$, namely the
graph $G(A_n) = G_{1/n}(n, A_n)$. This graph has vertex set $[n] = \{1, 2, \ldots, n\}$,
the events that different edges are present are independent, and the probability
that $ij$ is present is $\min\{a_{ij}/n, 1\}$. If some of the $a_{ii}$ are non-zero then $G(A_n)$
may contain loops; this will be irrelevant for us here, since we study only the
component structure of $G(A_n)$. Often, it is convenient to consider minor variants
of these definitions: in the Poisson multi-graph variant, $G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$, the number of
copies of each possible edge $ij$ is Poisson with mean $a_{ij}/n$. In the Poisson simple
graph variant, $G_{\mathrm{Po}}(A_n)$, the probability that $ij$ is present is $1 - \exp(-a_{ij}/n)$;
in both cases the numbers of copies of different edges are independent. Thus
$G_{\mathrm{Po}}(A_n)$ is the simple graph underlying $G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$. Most of the time it makes
no difference which variant we consider. Indeed, whenever $a_{ij} < n/2$, say, for
all $i$ and $j$, then

$G(A_n) \overset{d}{=} G_{\mathrm{Po}}(A'_n)$, (1)

where $\overset{d}{=}$ denotes equality in distribution, and $A'_n$ is the matrix with entries

$a'_{ij} = -n \log(1 - a_{ij}/n) = a_{ij} + O(a_{ij}^2/n)$. (2)

In the typical case considered here, the entries $a_{ij}$ are small compared to $n$,
so switching between $G(\cdot)$ and $G_{\mathrm{Po}}(\cdot)$ thus corresponds to a minor change in
the edge probability parameters. Moreover, under the rather weak assumptions
$\max_{ij} a_{ij} < n/2$ and $\sum_{ij} a_{ij}^3 = o(n^3)$, the random graphs $G(A_n)$ and
$G_{\mathrm{Po}}(A_n)$ are asymptotically equivalent in the strong sense that they can be
coupled so that they are equal whp; see [17, Corollary 2.13].
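The change of parameters in (2) is easy to check numerically. The Python sketch below is an illustration only (the function name `poisson_equivalent` and the sample values of $n$ and $a_{ij}$ are assumptions): it computes one entry of $A'_n$ and compares the edge probability $a_{ij}/n$ with the Poisson-variant probability $1 - \exp(-a'_{ij}/n)$.

```python
import math

def poisson_equivalent(a, n):
    """Entry a' = -n*log(1 - a/n) of A'_n as in (2); requires 0 <= a < n."""
    return -n * math.log1p(-a / n)

n, a = 1000, 3.0                           # hypothetical sample values
a_prime = poisson_equivalent(a, n)
p_bernoulli = min(a / n, 1.0)              # edge probability in G(A_n)
p_poisson = 1.0 - math.exp(-a_prime / n)   # edge probability in the Poisson simple graph
```

The two probabilities agree up to rounding, and `a_prime - a` is of order $a^2/n$, in line with the error term in (2).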

Having described the limit object (a kernel), and the random graph, it re- 
mains to describe the notion of convergence. In doing so it is convenient to 
consider somewhat more general kernels. 

Let $(S, \mu)$ be a probability space; most of the time we shall take $S$ to be $[0, 1]$
(or $(0, 1]$) with $\mu$ Lebesgue measure. A kernel on $S$ is an integrable, symmetric
function $\kappa : S^2 \to [0,\infty)$. Following Frieze and Kannan [T^, for $W \in L^1(S^2)$
we define the cut norm of $W$ by

$\|W\|_{\square,1} := \sup_{S,T} \left| \int_{S \times T} W(x,y)\,d\mu(x)\,d\mu(y) \right|$, (3)

where the supremum is taken over all pairs of measurable subsets of $S$. Alternatively, one can take

$\|W\|_{\square,2} := \sup_{\|f\|_\infty, \|g\|_\infty \le 1} \left| \int_{S^2} f(x) W(x,y) g(y)\,d\mu(x)\,d\mu(y) \right|$. (4)

In taking the supremum in (4) one can restrict to functions $f$ and $g$ taking only
the values $\pm 1$; it follows that

$\|W\|_{\square,1} \le \|W\|_{\square,2} \le 4\|W\|_{\square,1}$.

Thus the two norms $\|\cdot\|_{\square,1}$ and $\|\cdot\|_{\square,2}$ are equivalent, and it will almost never
matter which one we use. We shall write $\|\cdot\|_\square$ for either norm, commenting in
the few cases where the choice matters. (There are further, equivalent versions
of the cut norm; see Borgs, Chayes, Lovász, Sós and Vesztergombi [9].)
Note that for either definition of the cut norm we have

$\left| \int_{S^2} W \right| \le \|W\|_\square \le \|W\|_{L^1}$.

The definition (4) is natural for a functional analyst: this norm is the dual
of the projective tensor product norm in $L^\infty \hat\otimes L^\infty$, and is thus the injective
tensor product norm in $L^1 \check\otimes L^1$; equivalently, it is equal to the operator norm of
the corresponding integral operator $L^\infty \to L^1$. One advantage of this version
is the simple "Banach module" property we shall note later in (23). On the
other hand, (3) is probably more familiar in combinatorics, and (surprisingly)
occasionally has a tiny advantage; see Section 3.
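For the step kernel of a small matrix, both versions of the cut norm can be found by exhaustive search, since the bilinear forms in (3) and (4) are maximized at vectors with entries in $\{0,1\}$ and $\{-1,+1\}$ respectively. The following Python sketch is illustrative only (the function name `cut_norms` is an assumption, and the running time grows like $4^n$, so it is usable only for very small $n$).

```python
from itertools import product

def cut_norms(W):
    """Brute-force both cut norms of the step kernel of a small square matrix W."""
    n = len(W)

    def best(values):
        # Maximize |sum_ij u_i W_ij v_j| over vectors u, v with entries in `values`.
        top = 0.0
        for u in product(values, repeat=n):
            for v in product(values, repeat=n):
                total = sum(u[i] * W[i][j] * v[j] for i in range(n) for j in range(n))
                top = max(top, abs(total))
        return top / n**2   # each of the n^2 squares has measure 1/n^2

    return best((0, 1)), best((-1, 1))

W = [[1.0, -1.0], [-1.0, 1.0]]
norm1, norm2 = cut_norms(W)
```

For this particular $W$ the second norm is four times the first, so the constant 4 in the inequality above cannot be improved in general.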

Given a kernel $\kappa$ and a measure-preserving bijection $\tau : S \to S$, let $\kappa^{(\tau)}$ be
the kernel defined by

$\kappa^{(\tau)}(x,y) = \kappa(\tau(x), \tau(y))$;

we call $\kappa^{(\tau)}$ a rearrangement of $\kappa$. We write $\kappa \sim \kappa'$ if $\kappa'$ is a rearrangement of
$\kappa$. Given two kernels $\kappa$, $\kappa'$ on $[0,1]$, the cut metric of Borgs, Chayes, Lovász,
Sós and Vesztergombi [9] is defined by

$\delta_\square(\kappa, \kappa') = \inf_{\kappa'' \sim \kappa'} \|\kappa - \kappa''\|_\square$. (5)

If we wish to specify which version of the cut norm is involved, we write $\delta_{\square,1}$ or
$\delta_{\square,2}$. Usually, this is irrelevant.

As in [5], one can also define $\delta_\square$ using couplings between different kernels,
rather than rearrangements. In this case it is irrelevant that the kernels are
on the same probability space. In particular, we may regard a matrix $A_n$ as
a kernel on the discrete space with $n$ equiprobable elements. Then (by an
obvious coupling) $\delta_\square(A_n, \kappa_{A_n}) = 0$, where $\kappa_{A_n}$ is the $n$-by-$n$ kernel on $[0,1]$
corresponding to $A_n$. Thus $\delta_\square(A_n, \kappa) = \delta_\square(\kappa_{A_n}, \kappa)$ for any kernel $\kappa$ on any
probability space $(S,\mu)$. In the light of this we shall often identify a matrix
with the corresponding kernel on $[0, 1]$.

Throughout this paper, we shall consider sequences $(A_n)$ of matrices such
that for some kernel $\kappa$ we have $\delta_\square(A_n, \kappa) \to 0$. It follows from the results of [16]
that for any kernel $\kappa$ on a probability space $(S, \mu)$, there exists a kernel $\kappa'$ on
$[0, 1]$ with $\delta_\square(\kappa, \kappa') = 0$. Hence we lose no generality by taking $(S, \mu)$ to be the
standard ground space in which $S = [0, 1]$ (or $(0, 1]$) and $\mu$ is Lebesgue measure.
In this case it is natural to identify $A_n$ with $\kappa_{A_n}$ as above, and we may use the
more down-to-earth formula (5) as the definition of $\delta_\square$.

To state our results we need two further definitions, from [4]. Given a kernel $\kappa$
on a probability space $(S, \mu)$, let $X_\kappa$ be the multi-type Galton-Watson branching
process defined as follows. We start with a single particle in generation 0, whose
type has the distribution $\mu$. A particle in generation $t$ of type $x$ gives rise to
children in generation $t + 1$ whose types form a Poisson process on $S$ with
intensity $\kappa(x,y)\,d\mu(y)$. The children of different particles are independent, and
independent of the history.
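For a finite-type kernel the process just described is straightforward to simulate: a particle of type $x$ has, independently for each type $y$, a Poisson number of children of type $y$ with mean $\kappa(x,y)\mu(y)$. The Python sketch below is an illustration under stated assumptions, not part of the paper: the kernel is given as a matrix `K` of values on the parts of a finite partition with masses `mu`, the names `poisson` and `total_progeny` are hypothetical, and the population is truncated at `cap` particles because a surviving process is infinite.

```python
import math
import random

def poisson(lam, rng):
    """Sample Poisson(lam) by Knuth's multiplication method (adequate for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def total_progeny(K, mu, rng, cap=10_000):
    """Total number of particles in X_kappa, stopping once `cap` is reached."""
    r = len(mu)
    root = rng.choices(range(r), weights=mu)[0]   # type of the initial particle ~ mu
    unexplored, total = [root], 1
    while unexplored and total < cap:
        x = unexplored.pop()
        for y in range(r):
            # Children of type y: Poisson with mean kappa(x, y) * mu(y).
            kids = poisson(K[x][y] * mu[y], rng)
            unexplored.extend([y] * kids)
            total += kids
    return total
```

Running `total_progeny` many times and recording how often the cap is reached gives a crude Monte Carlo estimate of the survival probability; the order in which particles are explored does not affect the total.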

We shall also consider the branching processes $X_\kappa(x)$, $x \in S$, defined as
above except that $X_\kappa(x)$ starts with a single particle of the given type $x$.

Let $\rho(\kappa)$ denote the survival probability of $X_\kappa$, i.e., the probability that all
generations are non-empty. It is easily seen that this is the same as the probability
that the total number $|X_\kappa|$ of particles in $X_\kappa$ is infinite. For basic results
about $\rho(\kappa)$, we refer the reader to [4].

Finally, as in [3], a kernel $\kappa$ is reducible if there exists $A \subset S$ with
$0 < \mu(A) < 1$ such that $\kappa$ is zero almost everywhere on $A \times (S \setminus A)$. Otherwise, $\kappa$
is irreducible.

Throughout, we use standard notation for probabilistic asymptotics as in [18].
For example, $\overset{p}{\to}$ denotes convergence in probability, and $X_n = o_p(f(n))$ means
$X_n/f(n) \overset{p}{\to} 0$.

1.1 Main results 

In this subsection we state our main results; we shall give corresponding results
for hypergraphs in Section 3. Recall that any matrix denoted by $A_n$ is assumed
to be a symmetric $n$-by-$n$ matrix with non-negative entries. Given a graph
$G$ and an $i \ge 1$, we write $C_i(G)$ for the number of vertices in the $i$th largest
component of $G$, with $C_i(G) = 0$ if $G$ has fewer than $i$ components. We shall
see later that our results imply corresponding results for the Poisson variants of
$G(A_n)$; for simplicity we state them only in the original formulation, where the
edge probabilities are $\min\{a_{ij}/n, 1\}$. The theorems are valid for a kernel $\kappa$ on
any probability space $(S,\mu)$, but as noted above we may assume without loss
of generality that $S = [0, 1]$, and we shall do so in the proofs for convenience.
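The quantities appearing in the theorems below can be explored experimentally. The following Python sketch is an illustration only: the names `sample_graph`, `component_sizes` and `C` are assumptions, vertices are labelled $0, \ldots, n-1$ rather than $1, \ldots, n$, and loops are ignored since they do not affect components. It samples $G(A_n)$ with edge probabilities $\min\{a_{ij}/n, 1\}$ and computes the component sizes with a union-find structure.

```python
import random

def sample_graph(A, rng):
    """Edge list of G(A_n): edge ij is present with probability min(a_ij/n, 1)."""
    n = len(A)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < min(A[i][j] / n, 1.0)]

def component_sizes(n, edges):
    """Sizes of the components of a graph on {0,...,n-1}, largest first."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for i, j in edges:
        parent[find(i)] = find(j)
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

def C(i, sizes):
    """C_i(G): size of the i-th largest component, or 0 if there are fewer than i."""
    return sizes[i - 1] if i <= len(sizes) else 0
```

For instance, `C(1, component_sizes(n, sample_graph(A, rng))) / n` is the normalized size of the largest component.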

Theorem 1.1. Let $\kappa$ be a kernel and $(A_n)$ a sequence of symmetric non-negative
$n$-by-$n$ matrices such that $\delta_\square(A_n, \kappa) \to 0$. Then $C_1(G(A_n))/n \le \rho(\kappa) + o_p(1)$.
If $\kappa$ is irreducible, then $C_1(G(A_n))/n \overset{p}{\to} \rho(\kappa)$ and $C_2(G(A_n)) = o_p(n)$.

Of course, as usual we do not require $A_n$ to be defined for every $n$, only for
a subsequence.

Let $\rho_\kappa(x)$ denote the survival probability of the process $X_\kappa(x)$ started with
a particle of type $x$. Let $T_\kappa$ be the integral operator on $S$ with kernel $\kappa$, defined
by

$(T_\kappa f)(x) = \int_S \kappa(x,y) f(y)\,d\mu(y)$, (6)

for any (measurable) function $f$ such that this integral is defined (finite or $+\infty$)
for a.e. $x$. Note that this class of functions includes every (measurable) function
$f \ge 0$. Also, let

$\|T_\kappa\| = \sup\{\|T_\kappa f\|_2 : \|f\|_2 \le 1,\ f \ge 0\} \le \infty$;

clearly if $\|T_\kappa\| < \infty$, then $\|T_\kappa\|$ is simply the norm of $T_\kappa$ as an operator on
$L^2(S, \mu)$.

Recall from [4, Theorem 6.2] that $\rho(\kappa) > 0$ if and only if $\|T_\kappa\| > 1$, and that
if $\|T_\kappa\| > 1$, then $\rho_\kappa$ is the unique non-zero solution $f \ge 0$ to the functional
equation

$f = 1 - \exp(-T_\kappa f)$.
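For a finite-type kernel the functional equation becomes a finite system, which can be solved numerically by iterating the map $f \mapsto 1 - \exp(-T_\kappa f)$ starting from $f = 1$. The Python sketch below is an illustration under assumptions, not a method taken from the paper: the kernel is represented by a matrix `K` of values on the parts of a partition with masses `mu`, the name `survival_probability` is hypothetical, and the number of iterations is fixed rather than chosen adaptively (convergence is slow near criticality). Starting from $f = 1$ the iterates decrease, so the limit is the largest solution; that this largest solution is the survival probability is the understanding assumed here.

```python
import math

def survival_probability(K, mu, iterations=500):
    """Iterate f <- 1 - exp(-T f) from f = 1; return (rho, f) for a finite-type kernel."""
    r = len(mu)
    f = [1.0] * r
    for _ in range(iterations):
        # (T f)_i = sum_j K[i][j] * mu[j] * f[j], the finite-type form of (6).
        Tf = [sum(K[i][j] * mu[j] * f[j] for j in range(r)) for i in range(r)]
        f = [1.0 - math.exp(-t) for t in Tf]
    rho = sum(mu[i] * f[i] for i in range(r))   # average over the type of the root
    return rho, f

rho, f = survival_probability([[2.0]], [1.0])   # constant kernel kappa = 2
```

With the constant kernel $\kappa = c$ this reduces to the classical equation $\rho = 1 - e^{-c\rho}$ for $G(n, c/n)$.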

Using Theorem 1.1 we shall deduce the following slight extension, describing
the 'critical' value of $c$ above which a giant component appears in $G(cA_n)$.

Theorem 1.2. Let $\kappa$ be a kernel, $(A_n)$ a sequence of symmetric non-negative
$n$-by-$n$ matrices such that $\delta_\square(A_n, \kappa) \to 0$, and $c > 0$ a constant, and set
$G_n = G(cA_n)$.

(a) If $c \le \|T_\kappa\|^{-1}$, then $C_1(G_n) = o_p(n)$.

(b) If $c > \|T_\kappa\|^{-1}$, then $C_1(G_n) = \Theta(n)$ whp. Furthermore, if $\kappa$ is bounded,
then for any constant $\alpha < (c\|T_\kappa\| - 1)/(c \sup \kappa)$ we have $C_1(G_n) \ge \alpha n$
whp.

(c) If $\kappa$ is irreducible, then $C_1(G_n)/n \overset{p}{\to} \rho(c\kappa)$ and $C_2(G_n) = o_p(n)$.

This clearly generalizes the main result, Theorem 1, of Bollobás, Borgs,
Chayes and Riordan [5], which is simply the special case in which $\kappa$ and the
entries of the matrices $A_n$ are uniformly bounded. As we shall see in the next
subsection, Theorem 1.2 also generalizes Theorem 3.1 of [3]. Note, however,
that to prove this requires various results from [4].
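For a finite-type kernel the critical value $1/\|T_\kappa\|$ in Theorem 1.2 can be computed directly, since $T_\kappa$ is then unitarily equivalent to the symmetric matrix with entries $\sqrt{\mu_i}\,\kappa_{ij}\sqrt{\mu_j}$. The Python sketch below is illustrative only: the names `operator_norm` and `critical_c` are assumptions, and a shifted power iteration (applied to $M + I$, to avoid oscillation for bipartite-like kernels) stands in for a proper eigenvalue routine.

```python
import math

def operator_norm(K, mu, iterations=2000):
    """Norm of T_kappa on L^2(mu) for a finite-type kernel, by shifted power iteration."""
    r = len(mu)
    # Symmetric matrix representing T_kappa in the basis of normalized indicators.
    M = [[math.sqrt(mu[i]) * K[i][j] * math.sqrt(mu[j]) for j in range(r)]
         for i in range(r)]
    v = [1.0 / math.sqrt(r)] * r
    for _ in range(iterations):
        w = [v[i] + sum(M[i][j] * v[j] for j in range(r)) for i in range(r)]  # (M + I) v
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    # Rayleigh quotient of M at the approximate top eigenvector.
    return sum(v[i] * M[i][j] * v[j] for i in range(r) for j in range(r))

def critical_c(K, mu):
    """Threshold 1/||T_kappa|| from Theorem 1.2 (assumes the norm is positive)."""
    return 1.0 / operator_norm(K, mu)
```

For the constant kernel $\kappa = 2$ this gives the threshold $c = 1/2$, i.e. the point at which $c\kappa = 1$.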

Returning to the irreducible case, we shall also prove a 'stability' result
analogous to Theorem 3.9 of [3].

Theorem 1.3. Let $\kappa$ be an irreducible kernel and $(A_n)$ a sequence of non-negative
symmetric $n$-by-$n$ matrices such that $\delta_\square(A_n, \kappa) \to 0$. For every $\varepsilon > 0$
there is a $\delta = \delta(\kappa, \varepsilon) > 0$ such that, whp,

$\rho(\kappa) - \varepsilon \le C_1(G'_n)/n \le \rho(\kappa) + \varepsilon$

for every graph $G'_n$ that may be obtained from $G_n = G(A_n)$ by deleting at most
$\delta n$ vertices and their incident edges, and then adding or deleting at most $\delta n$
edges.

As we shall show in Subsection 2.6, using this result it is not hard to deduce
exponential tail bounds on the size of the giant component.

Theorem 1.4. Let $\kappa$ be an irreducible kernel and $\varepsilon > 0$ a real number. There
is a $\gamma = \gamma(\kappa, \varepsilon) > 0$ such that whenever $(A_n)$ is a sequence of non-negative
symmetric $n$-by-$n$ matrices with $\delta_\square(A_n, \kappa) \to 0$, then setting $G_n = G(A_n)$ we have

$P\bigl(|C_1(G_n) - \rho(\kappa) n| \ge \varepsilon n\bigr) \le e^{-\gamma n}$

and

$P\bigl(C_2(G_n) \ge \varepsilon n\bigr) \le e^{-\gamma n}$

for all large enough $n$.

For the very special case of $G(n,p)$, $p = c/n$, much stronger results are
known, establishing the correct dependence of $\gamma$ on $\varepsilon$ in the upper and lower
bounds. Indeed, such a 'large deviation principle' for $C_1(G(n,c/n))$ was obtained
by O'Connell [23], and Biskup, Chayes and Smith [2] proved a corresponding
result for the number of vertices in 'large' components. One might
ask whether these results can be generalized to $G(A_n)$; this is likely to be rather
hard. Indeed, it is not even clear whether they extend to $G(A_n)$ with $A_n$ converging
to a constant kernel $\kappa$.

Remark 1.5. We have stated all our results for a deterministic sequence $A_n$
with $\delta_\square(A_n, \kappa) \to 0$. In applications, however, the matrices $A_n$ are often random,
and $G_n$ is defined by first conditioning on $A_n$, and then taking the entries
as giving the conditional probabilities of the edges, which are conditionally independent.
The conclusions of Theorems 1.1-1.3 are all of the form that $G(A_n)$ has
certain properties whp. Having proved such a result assuming $\delta_\square(A_n, \kappa) \to 0$,
the corresponding result with $A_n$ random and $\delta_\square(A_n, \kappa) \overset{p}{\to} 0$ follows immediately.
One way of seeing this is to note that a sequence $E_n$ of events holds
whp if and only if every subsequence has a subsubsequence holding whp. If
$\delta_\square(A_n, \kappa) \overset{p}{\to} 0$, then given a subsequence (with deterministic indices) of the
random sequence $(A_n)$, one can find a subsubsequence such that $\delta_\square(A_n, \kappa) \to 0$
holds a.s., condition on the matrices in this subsubsequence, and apply the result
for the deterministic case.

The rest of the paper is organized as follows. In the next few subsections we
discuss various applications and consequences of the results above. In Section 2
we prove Theorems 1.1-1.4; as the proofs are somewhat lengthy we shall break
this section into subsections. Finally, in Section 3 we present extensions of
our main results to the hyperkernels and corresponding random (hyper)graphs
considered in [5].

1.2 Relationship to the sparse inhomogeneous model 

In this subsection we shall prove a simple lemma which, together with Theorem 1.2,
implies Theorem 3.1 of [1]. This latter result states that (essentially)
the conclusions of Theorems 1.1 and 1.2 (with $c = 1$) hold when the random
graph $G_n$ is an instance of the general sparse inhomogeneous model $G^{\mathcal{V}}(n, \kappa_n)$
of [3]. Since the full definitions of [3] are rather cumbersome, for this subsection
only we assume a certain familiarity with the terminology of [3].

We say that a kernel $\kappa$ on $(S,\mu)$ is of finite type if there is a finite partition
$(S_1, \ldots, S_r)$ of $S$ into measurable sets such that $\kappa$ is constant on each of the sets
$S_i \times S_j$. A key strategy we used in [4] was to reduce results about the general
case to the finite-type case; we shall use the same approach in this subsection.
In the rest of this paper we follow a different strategy, using cut convergence to
directly prove results about the general case.

The sparse inhomogeneous model $G^{\mathcal{V}}(n, \kappa_n)$ is defined in terms of a ground
space $\mathcal{V} = (S, \mu, (\mathbf{x}_n))$, and a sequence $(\kappa_n)$ of kernels on $(S,\mu)$. Here $(S,\mu)$
is a probability space (satisfying some additional assumptions) and each $\mathbf{x}_n$
is a (deterministic or) random sequence of $n$ points of $S$, satisfying certain
technical assumptions. The sequence $(\kappa_n)$ is assumed to converge to a kernel
$\kappa$ in a certain sense, and must also satisfy a certain 'graphicality' assumption
that involves the sequences $\mathbf{x}_n$. For the full technical details, which will not be
relevant here, see [4].

As noted in [4, Remark 8.8], in proving results about this model one may
always assume that the vertex types are deterministic. In this case $G^{\mathcal{V}}(n,\kappa_n)$
has the distribution of $G(A_n)$, where $A_n$ is the matrix obtained by sampling
the kernel according to the vertex types: $A_n$ has entries $a_{ij} = a_{ij}^{(n)}$ given by
$a_{ij} = \kappa_n(x_i^{(n)}, x_j^{(n)}) \wedge n$ for $i \ne j$ and $a_{ii} = 0$, where $x \wedge y = \min\{x, y\}$. We refer
the reader to [4] for the formal definition of $G^{\mathcal{V}}(n, \kappa_n)$, and in particular for the
precise definitions of a (generalized) vertex space and a graphical (sequence of)
kernel(s).

The next lemma shows that the matrices $A_n$ associated to $G^{\mathcal{V}}(n,\kappa_n)$ do
converge in probability to the limit kernel $\kappa$ in the cut metric. Although our
main interest is in the cut distance, we in fact obtain a result for the $L^1$ norm,
modulo rearrangements. Given two kernels $\kappa$, $\kappa'$ on the standard ground space,
let

$\delta_1(\kappa, \kappa') = \inf_{\kappa'' \sim \kappa'} \|\kappa - \kappa''\|_{L^1}$, (7)

in analogy with (5). More generally, for two kernels on arbitrary (not necessarily
equal) probability spaces, we may define $\delta_1(\kappa, \kappa')$ as a certain infimum over
couplings of these probability spaces; we omit the details.

Lemma 1.6. Let $\mathcal{V} = (S, \mu, (\mathbf{x}_n))$ be a vertex space, and let $(\kappa_n)$ be a sequence
of kernels that is graphical on $\mathcal{V}$ with limit $\kappa$. Let $A_n$ be the matrix with entries
$a_{ij} = \kappa_n(x_i^{(n)}, x_j^{(n)}) \wedge n$ for $i \ne j$ and $a_{ii} = 0$. Then $\delta_1(\kappa_{A_n}, \kappa) \overset{p}{\to} 0$ and

$\delta_\square(A_n, \kappa) = \delta_\square(\kappa_{A_n}, \kappa) \overset{p}{\to} 0$.

Proof. Since $\|\kappa'\|_\square \le \|\kappa'\|_{L^1}$ for any $\kappa'$, we have $\delta_\square(\kappa_1, \kappa_2) \le \delta_1(\kappa_1, \kappa_2)$ for any
two kernels, so it suffices to prove the first statement.

Conditioning on the vertex types, we may and shall assume that the vertex
types are deterministic. For convenience we assume that $S$ is the standard
ground space $[0, 1]$. (The general case requires couplings of $\kappa$ and $A_n$, but is
otherwise the same.)

Suppose first that $\kappa$ is regular finitary; roughly speaking, this means that
$\kappa$ is of finite type. (More precisely, $\kappa$ must be of finite type and must satisfy
an additional technical condition; see [1].) Suppose also that $\kappa_n = \kappa$ for every
$n$. In this case the result is essentially trivial: we may assume that there is a
partition of $S$ into sets $S_1, \ldots, S_k$ such that $\kappa$ is constant on each set $S_r \times S_s$.
The definition of a vertex space ensures that for each $r$ there are $\mu(S_r) n + o(n)$
vertices $i$ such that $x_i \in S_r$. Rearranging (or coupling) appropriately, we may
assume that each $S_r$ is an interval $I_r \subset S = [0,1]$. We may then order the
vertices so that for all but $o(n)$ vertices $i$ the interval $((i-1)/n, i/n]$ lies entirely
inside the interval $I_r$ containing $x_i$. After doing so, $\kappa$ and $\kappa_{A_n}$ differ on a set of
measure $o(1)$. Since both are bounded by $\sup \kappa < \infty$, it follows that $\kappa_{A_n} \to \kappa$
in $L^1$, and hence in $\delta_1$.

To treat the general case, we approximate by finite-type kernels, as so often
in [3]. Indeed, by Lemma 7.3 of [3] there is a sequence of regular finitary kernels
$\kappa_m^-$ such that $\kappa_m^- \le \kappa_n$ for all $n \ge m$ and $\kappa_m^-(x, y) \nearrow \kappa(x, y)$ for a.e. $(x, y) \in S^2$.
By monotone convergence, we have $\int \kappa_m^- \to \int \kappa$ as $m \to \infty$. Fix $\varepsilon > 0$. Then
there is some $m$ such that $\kappa^- = \kappa_m^-$ satisfies $\kappa^- \le \kappa$ and $\int (\kappa - \kappa^-) < \varepsilon$.

Let $A_n^-$ be the matrix with entries $a_{ij}^- = \kappa^-(x_i^{(n)}, x_j^{(n)}) \wedge n$, $i \ne j$, and
$a_{ii}^- = 0$. Considering from now on only $n \ge m$, we then have $a_{ij}^- \le a_{ij}$ and thus
$\kappa_{A_n^-} \le \kappa_{A_n}$ pointwise. After conditioning on the vertex types, the expected
number of edges in $G^{\mathcal{V}}(n, \kappa_n)$ is exactly

$\sum_{i<j} \frac{a_{ij}}{n} = \frac{1}{2} \sum_{i} \sum_{j} \frac{a_{ij}}{n} = \frac{n}{2} \int \kappa_{A_n}$,

using $a_{ii} = 0$ for the first equality. Thus, by Lemma 8.7 of [4], $\int \kappa_{A_n} \to \int \kappa$.
Similarly (since a finite-type kernel is always graphical), $\int \kappa_{A_n^-} \to \int \kappa^-$. Hence,
since $\kappa_{A_n^-} \le \kappa_{A_n}$,

$\|\kappa_{A_n} - \kappa_{A_n^-}\|_{L^1} = \int \kappa_{A_n} - \int \kappa_{A_n^-} \to \int \kappa - \int \kappa^- < \varepsilon$.

By the finite-type case above, we have $\delta_1(\kappa_{A_n^-}, \kappa^-) \to 0$. Since $\|\kappa - \kappa^-\|_{L^1} < \varepsilon$
it follows that $\limsup \delta_1(\kappa_{A_n}, \kappa) < 2\varepsilon$. Recalling that $\varepsilon > 0$ was arbitrary, the
result follows. □

Recall that Theorem 3.1 of [3] states (essentially) that the random graphs
$G_n = G^{\mathcal{V}}(n, \kappa_n)$ satisfy the conclusions of Theorems 1.1 and 1.2. Using Lemma 1.6,
by Remark 1.5 the vertex space case of this result follows immediately from Theorems
1.1 and 1.2. As noted in [4, Section 8.1], the apparent extra generality
of generalized vertex spaces makes no essential difference, so Theorem 3.1 of [3]
then follows. In other words, we have shown that Theorem 3.1 of [1] may be
deduced from our present Theorems 1.1 and 1.2 using various results from [4]
mentioned above. Let us remark that in practice, the conditions of Theorem
3.1 of [3] will often be easier to verify than those of Theorems 1.1 and 1.2.

1.3 Further applications 

As noted in [5], the definitions in [3] exclude one simple case to which the
results clearly extend, namely the case of an arbitrary integrable kernel $\kappa$,
and i.i.d. vertex types: given a kernel $\kappa$, one may define the random graph
$G(n,\kappa) = G_{1/n}(n, \kappa)$ on $[n]$ by taking $x_1, \ldots, x_n$ to be independent and uniformly
distributed on $[0,1]$, and given these 'vertex types', joining each pair
$\{i, j\}$ of vertices with probability $\min\{\kappa(x_i, x_j)/n, 1\}$, independently of all other
pairs. With $\kappa$ bounded, a corresponding dense random graph was studied by
Lovász and Szegedy [19].

Our next lemma shows that Theorems 1.1-1.3 apply (unsurprisingly) to the
graphs $G(n, \kappa)$, since the (random) matrices of edge probabilities associated to
$G(n, \kappa)$ converge to $\kappa$ in probability in $\delta_\square$.

Lemma 1.7. Let $\kappa$ be a kernel. For $n \ge 1$ let $x_1, \ldots, x_n$ be i.i.d. uniform points
from $S$, and let $A_n$ be the $n$-by-$n$ matrix with entries $a_{ij} = \kappa(x_i, x_j)$ for $i \ne j$,
and $a_{ii} = 0$. Then $\delta_1(A_n, \kappa) \overset{p}{\to} 0$ and $\delta_\square(A_n, \kappa) \overset{p}{\to} 0$.

Proof. As before, we have $\delta_\square \le \delta_1$, so it suffices to prove the first statement. Fix
$\varepsilon > 0$. By standard results there is a finite-type kernel $\kappa'$ such that $\|\kappa - \kappa'\|_{L^1} < \varepsilon^2$.
Indeed, this follows by the construction of the product measure, since the
rectangular sets $A \times B$ generate an algebra $\mathcal{A}$ that generates the product $\sigma$-field,
and it is easily seen that finite linear combinations of indicator functions
of sets in $\mathcal{A}$ are dense in $L^1(S^2)$.

Let $A'_n$ be the matrix with entries $a'_{ij} = \kappa'(x_i, x_j)$, $i \ne j$, and $a'_{ii} = 0$. Then

$E\|\kappa_{A_n} - \kappa_{A'_n}\|_{L^1} = \frac{n(n-1)}{n^2} \|\kappa - \kappa'\|_{L^1} < \varepsilon^2$,

so, by Markov's inequality, with probability at least $1 - \varepsilon$ we have

$\delta_1(A_n, A'_n) = \delta_1(\kappa_{A_n}, \kappa_{A'_n}) \le \|\kappa_{A_n} - \kappa_{A'_n}\|_{L^1} < \varepsilon$. (8)

Since $\kappa'$ is of finite type, it is essentially trivial that $\delta_1(A'_n, \kappa') \overset{p}{\to} 0$ as $n \to \infty$;
the argument is similar to one in the previous subsection, so we omit the details.
Since $\delta_1(\kappa, \kappa') \le \|\kappa - \kappa'\|_{L^1} < \varepsilon^2$,

$\delta_1(A_n, \kappa) \le \delta_1(A_n, A'_n) + \delta_1(A'_n, \kappa') + \delta_1(\kappa', \kappa)$,

and $\varepsilon > 0$ was arbitrary, it follows that $\delta_1(A_n, \kappa) \overset{p}{\to} 0$, as claimed. □

So far we have shown that the results in Subsection 1.1 imply many existing
results about the giant component in various sparse random graphs. We now
turn to a new application, giving an example that we believe is not covered by
known results.

Let $p = p(n)$ be some normalizing function, with $0 < p < 1$ and $p(n) \to 0$.
Let $G_n$ be a sequence of graphs in which $G_n$ has $n$ vertices and $\Theta(pn^2)$ edges, and
let $\kappa$ be a kernel. Following the terminology of [6, 7], we say that $\delta_\square(G_n, \kappa) \to 0$
if $\delta_\square(A_n, \kappa) \to 0$, where $A_n$ is $1/p$ times the adjacency matrix of $G_n$. A sequence
$(G_n)$ satisfying this condition may be thought of as a sequence of inhomogeneous
sparse quasi-random graphs. For graphs which are dense and
homogeneous, there are many equivalent definitions of quasi-randomness, or
pseudo-randomness; see Thomason [25, 26] or Chung, Graham and Wilson [12],
for example. In the sparse case these notions are no longer equivalent, as discussed
by Chung and Graham [11] in the homogeneous case, and Bollobás and
Riordan [6] in general; when $\kappa$ is constant, normalizing so that $\kappa = 1$, we have
$\delta_\square(G_n, \kappa) \to 0$ if and only if

$\sup_{V \subseteq V(G_n)} \bigl| e(G_n[V]) - p|V|^2/2 \bigr| = o(pn^2)$; (9)

this condition is called DISC in [11]. Other, stronger conditions have also been
considered, in particular by Thomason [25, 26]. Our next result establishes
the threshold for percolation on an arbitrary sequence of inhomogeneous sparse
quasi-random graphs.

Theorem 1.8. Let $c > 0$ be a constant, let $p = p(n)$ be any function with
$c/n < p(n) < 1$, let $\kappa$ be an irreducible kernel on $[0, 1]^2$, and let $(G_n)$ be a sequence of
graphs with $|G_n| = n$ and $\delta_\square(G_n, \kappa) \to 0$. Writing $G'_n$ for the random subgraph
of $G_n$ obtained by selecting each edge independently with probability $c/(pn)$, we
have $C_1(G'_n)/n \overset{p}{\to} \rho(c\kappa)$. In particular, the threshold value of $c$ above which a
giant component appears in $G'_n$ is given by $1/\|T_\kappa\|$.

Proof. As above, let $A_n$ be $1/p$ times the adjacency matrix of $G_n$. Then, by
assumption, $\delta_\square(A_n, \kappa) \to 0$, so $\delta_\square(cA_n, c\kappa) \to 0$. The random subgraph $G'_n$ is
exactly $G(cA_n)$, so the result follows from Theorem 1.1. □

As noted in [6], one way to construct inhomogeneous sparse quasi-random
graphs is to consider appropriate random graphs, but this is not so interesting
in the present context: the random subgraphs of such graphs end up being the
graphs $G(n, \kappa)$ considered at the start of the subsection. A more interesting
application of Theorem 1.8 is to deterministic quasi-random graphs. In the
homogeneous case, where $\kappa = 1$ is constant, many such sequences are known.
One example is given by the 'polarity graphs' of Erdős and Rényi [Tl], defined
(for suitable $n$) by taking as vertices the points of the projective plane over
$GF(q)$, $q$ a prime power, and joining $x = (x_0, x_1, x_2)$ and $y = (y_0, y_1, y_2)$ if
and only if $x_0 y_0 + x_1 y_1 + x_2 y_2 = 0$ in $GF(q)$. Here $n = q^2 + q + 1$ and
$p = (q + 1)/n = \Theta(n^{-1/2})$. Other examples are the coset graphs of Chung [10]
and the Ramanujan graphs of Lubotzky, Phillips and Sarnak [20]. In all these
examples the limiting kernel is constant, so Theorem 1.8 says that on any of
these graphs, the threshold for percolation is when the average degree of the
random subgraph is equal to 1.
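The polarity graph is simple to construct explicitly when $q$ is prime, since arithmetic in $GF(q)$ is then arithmetic modulo $q$. The Python sketch below is an illustration restricted to that case (prime powers would need a genuine finite-field implementation); the names `projective_points` and `polarity_graph` are assumptions. Points are represented by vectors whose first non-zero coordinate is 1, and orthogonality does not depend on the choice of representative.

```python
from itertools import product

def projective_points(q):
    """Normalized representatives of the points of the projective plane over GF(q), q prime."""
    points = []
    for v in product(range(q), repeat=3):
        if any(v):
            first = next(c for c in v if c)   # first non-zero coordinate
            if first == 1:
                points.append(v)
    return points

def polarity_graph(q):
    """Adjacency lists: x ~ y iff x0*y0 + x1*y1 + x2*y2 = 0 mod q (loops included)."""
    points = projective_points(q)
    adjacency = {
        x: [y for y in points if sum(a * b for a, b in zip(x, y)) % q == 0]
        for x in points
    }
    return points, adjacency

points, adjacency = polarity_graph(3)
```

Each point is orthogonal to exactly $q + 1$ points (possibly including itself, for the 'absolute' points with $x \cdot x = 0$), which is the source of the density $p = (q+1)/n$ quoted above.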

Note that in the examples above, the matrices $A_n$ to which Theorem 1.1
or Theorem 1.2 is applied are very far from satisfying the uniform boundedness
condition assumed in Bollobás, Borgs, Chayes and Riordan [3]. Indeed, each
$A_n$ has all entries either 0 or $1/p$, where $p = p(n) \to 0$. This also implies that
the corresponding kernels $\kappa_{A_n}$, which do converge to $\kappa = 1$ in the cut norm, do
not converge in various natural stronger senses, such as pointwise or in $L^1$.

In general, it is very hard to compute the cut distance between two kernels.
Indeed, if $A_1$ and $A_2$ are the adjacency matrices of two graphs, then the general
problem of computing $\delta_\square(\kappa_{A_1}, \kappa_{A_2})$ includes as a special case deciding whether
$G_1$ and $G_2$ are isomorphic. Thus applications of Theorems 1.1 and 1.2 are
likely to involve special cases where cut convergence is guaranteed for some
simple reason, such as the example in the previous subsection.



1.4 Consequences for branching processes 

Theorem 1.1 has an interesting consequence purely concerning branching processes.
Recall that if $\kappa$ is a kernel, then $\rho(\kappa)$ denotes the survival probability of
the multi-type Poisson Galton-Watson process $X_\kappa$.

Theorem 1.9. Let $\kappa_m$, $m \ge 1$, and $\kappa$ be kernels with $\delta_\square(\kappa_m, \kappa) \to 0$ as $m \to \infty$.
Then $\rho(\kappa_m) \to \rho(\kappa)$.

Proof. Let us first note that the result is not really a statement about the cut
metric $\delta_\square$, but rather about the cut norm $\|\cdot\|_\square$. Indeed, by definition of $\delta_\square$
there are rearrangements $\kappa'_m$ of $\kappa_m$ with $\|\kappa'_m - \kappa\|_\square \le \delta_\square(\kappa_m, \kappa) + 1/m$, say,
and hence $\|\kappa'_m - \kappa\|_\square \to 0$. Since $\rho(\kappa'_m) = \rho(\kappa_m)$, in proving the result we may
assume if we like that $\|\kappa_m - \kappa\|_\square \to 0$.
We shall prove the result in three steps.

Step 1: suppose that all $\kappa_m$ are irreducible; this case is the heart of the
proof. For each $m$ we may find a sequence $A_n^{(m)}$ of symmetric $n$-by-$n$ matrices
with $\delta_\square(A_n^{(m)}, \kappa_m) \to 0$ as $n \to \infty$. Indeed, this is an immediate consequence of
Lemma 1.7. By Theorem 1.1, if $n$ is large enough, then

$P\bigl(|C_1(G(A_n^{(m)}))/n - \rho(\kappa_m)| \ge 1/m\bigr) \le 1/m^2$, (10)

say. Pick $n(m)$ such that (10) holds and $\delta_\square(A_{n(m)}^{(m)}, \kappa_m) \le 1/m$, and let
$A_m = A_{n(m)}^{(m)}$. By (10), with probability 1 we have

$\frac{C_1(G(A_m))}{|G(A_m)|} - \rho(\kappa_m) \to 0$. (11)

Now $\delta_\square(A_m, \kappa_m) \le 1/m$ by our choice of $n(m)$, while $\delta_\square(\kappa_m, \kappa) \to 0$, so
$\delta_\square(A_m, \kappa) \to 0$. Applying Theorem 1.1 again, we have $C_1(G(A_m))/|G(A_m)| \le
\rho(\kappa) + o_p(1)$. Together with (11) this implies that

$\limsup \rho(\kappa_m) \le \rho(\kappa)$. (12)

If $\kappa$ is irreducible, then we have $C_1(G(A_m))/|G(A_m)| \overset{p}{\to} \rho(\kappa)$, so $\rho(\kappa_m) \to \rho(\kappa)$,
as required. We shall return to the lower bound in the case that $\kappa$ is reducible
later.

Step 2: wc now consider the general case, where some of k and the 
may be reducible. By Theorem 6.4(i) of [4], given a kernel k' and a sequence 

tending pointwise down to k', we have p(kI^) p{k')- Applying this with 
= Km and = Km + 1/n, say, we see that for each m there is an e„i < 1/m 
such that \p{Krn) ~p(^m)| < wlicrc = K,„ +£,„• Now is irreducible, 
and \\k'^ — K„i\\n ^ 1/m — >■ 0, so (Sn(Kj„,K) — > 0, and the results of Step 1 
apply. In particular, the upper bound p2)) holds, and if k is irreducible, then 
pium) -> as required. 

Step 3: in the case where $\kappa$ is reducible, it remains to prove the lower bound
corresponding to (12). For this we decompose $\kappa$ into irreducible kernels as in [4].
As shown there (in Lemma 5.17), given any $\kappa$ there is a finite or countable
partition $(S_i)_{i=0}^N$, $N \le \infty$, of $S$ into measurable sets such that $\kappa = \sum_{i\ge 1} \kappa^{(i)}$
holds a.e., where each $\kappa^{(i)}$ is zero off $S_i \times S_i$ and irreducible when restricted
to $S_i \times S_i$. Fix $\varepsilon > 0$. Since $\rho(\kappa) = \sum_i \rho(\kappa^{(i)})$, there is some $k < \infty$ such
that $\sum_{i=1}^k \rho(\kappa^{(i)}) > \rho(\kappa) - \varepsilon$. Define $\kappa_m^{(i)}$ to be the kernel that is equal to $\kappa_m$
on $S_i \times S_i$ and zero off this set, and let $\kappa_m^k = \sum_{i=1}^k \kappa_m^{(i)}$. Then $\kappa_m \ge \kappa_m^k$, so
$\rho(\kappa_m) \ge \rho(\kappa_m^k) = \sum_{i=1}^k \rho(\kappa_m^{(i)})$. Since $\|\kappa_m - \kappa\|_\square \ge \|\kappa_m^{(i)} - \kappa^{(i)}\|_\square$ for each $i$,
we have $\|\kappa_m^{(i)} - \kappa^{(i)}\|_\square \to 0$ for each $i$. Since $\kappa^{(i)}$ is irreducible, by the result of
Step 2 we have $\rho(\kappa_m^{(i)}) \to \rho(\kappa^{(i)})$. Summing over $i$ from 1 to $k$ it follows that
\[ \liminf_{m\to\infty} \rho(\kappa_m) \ge \sum_{i=1}^k \rho(\kappa^{(i)}) > \rho(\kappa) - \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary we thus have $\liminf_{m\to\infty} \rho(\kappa_m) \ge \rho(\kappa)$. Together
with the upper bound established in Step 2, this completes the proof. □

Note that Theorem 1.9 is a purely analytic statement about branching processes
and the cut metric (or cut norm; rearrangements change nothing here).
However, the only proof we know is that above, which goes via graphs!
Corresponding results with much stronger assumptions (monotone convergence, either
upwards or downwards) were proved in [4]; these weaker results were all that
was needed there.

We close this section by giving a direct proof of a weaker form of Theorem 1.9
assuming $L^1$ convergence. As above, rearrangement is irrelevant, so it makes no
difference whether we suppose that $\delta_1(\kappa_n, \kappa) \to 0$ or $\|\kappa_n - \kappa\|_{L^1} \to 0$.

Theorem 1.10. Let $\kappa_n$, $n \ge 1$, and $\kappa$ be kernels on a probability space $(S, \mu)$,
with $\|\kappa_n - \kappa\|_{L^1} \to 0$ as $n \to \infty$. Then $\rho(\kappa_n) \to \rho(\kappa)$.

The proof will be based on weak-* convergence. Let $f_n$, $n \ge 1$, and $f$ be
functions in $L^\infty(S, \mu)$. The definition of the weak-* topology on $L^\infty$ is
that $f_n \xrightarrow{w^*} f$ if and only if
\[ \int_S g(x) f_n(x)\,d\mu(x) \to \int_S g(x) f(x)\,d\mu(x) \quad\text{for every } g \in L^1(S, \mu). \tag{13} \]



Lemma 1.11. Suppose that $\kappa \in L^1(S \times S)$ and $f_n \in L^\infty(S, \mu)$ with $f_n \xrightarrow{w^*} 0$.
Let $h_n = T_\kappa f_n$, so $h_n(x) = \int \kappa(x,y) f_n(y)\,d\mu(y)$. Then $h_n \to 0$ in $L^1(S, \mu)$.

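Proof. Note first that by the uniform boundedness principle we have $C =
\sup_n \|f_n\|_\infty < \infty$. (In fact, in the application, each $f_n$ is bounded by 1.)

Let $\varepsilon > 0$. As in the proof of Lemma 1.7 there is a finite-type kernel $\kappa'$
such that $\|\kappa - \kappa'\|_{L^1} < \varepsilon$. We may express $\kappa'$ as $\kappa'(x,y) = \sum_{i=1}^N \varphi_i(x)\psi_i(y)$
for $\varphi_i, \psi_i \in L^\infty$. (In fact, we may take each $\varphi_i$ or $\psi_i$ to be a constant times a
characteristic function.) Now
\[ \|h_n\|_{L^1} = \int \Bigl| \int \kappa(x,y) f_n(y)\,d\mu(y) \Bigr|\,d\mu(x)
   \le \iint \bigl| (\kappa(x,y) - \kappa'(x,y)) f_n(y) \bigr|\,d\mu(x)\,d\mu(y)
   + \sum_{i=1}^N \int \Bigl| \int \varphi_i(x)\psi_i(y) f_n(y)\,d\mu(y) \Bigr|\,d\mu(x). \]
The first term above is at most $\|\kappa - \kappa'\|_{L^1} \|f_n\|_\infty \le \varepsilon C$. The second term is
exactly
\[ \sum_{i=1}^N \|\varphi_i\|_{L^1} \Bigl| \int \psi_i(y) f_n(y)\,d\mu(y) \Bigr|. \]
Each integral tends to zero by the definition (13) of the weak-* topology, so
it follows that $\limsup \|h_n\|_{L^1} \le \varepsilon C$. Since $\varepsilon > 0$ was arbitrary, the result
follows. □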



With this preparation behind us, we turn to the proof of Theorem 1.10.

Proof of Theorem 1.10. We may assume without loss of generality that the
$\sigma$-field $\mathcal{F}$ on $S$ where $\mu$ is defined is countably generated, and thus $L^1(S, \mu)$ is
separable. One way to see this is to note that otherwise we can replace $\mathcal{F}$ by
a countably generated sub-$\sigma$-field $\mathcal{F}_0$ such that each $\kappa_n$ is $\mathcal{F}_0 \times \mathcal{F}_0$-measurable;
alternatively, by the results of [16] we may assume without loss of generality
that $S = [0, 1]$, with $\mu$ Lebesgue measure.

Suppose for simplicity that $\kappa$ is irreducible; arguing as in the proof of
Theorem 1.9 it is not hard to reduce the general case to this case.

Suppose for a contradiction that $\|\kappa_n - \kappa\|_{L^1} \to 0$ but $\rho(\kappa_n) \not\to \rho(\kappa)$. Passing
to a subsequence, we may assume that $|\rho(\kappa_n) - \rho(\kappa)|$ is bounded away from zero.
To obtain a contradiction it then suffices to show that for some subsequence
$(\kappa_{n_i})$ of $(\kappa_n)$ we have $\rho(\kappa_{n_i}) \to \rho(\kappa)$.
Let $\rho_n(x) = \rho_{\kappa_n}(x)$ be the survival probability of the branching process
$X_{\kappa_n}(x)$, started with a single particle of type $x$. As shown in [4], the function
$\rho_n$ satisfies
\[ \rho_n = 1 - \exp(-T_{\kappa_n}\rho_n). \tag{14} \]
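
For a finite-type kernel, the fixed-point equation (14) can be explored numerically. The following is a minimal sketch, not part of the original argument: it assumes a kernel that is constant on blocks, described by a hypothetical matrix `K` of kernel values and a list `mu` of type weights, and the function names are illustrative. Iterating from the constant function 1 gives a decreasing sequence which, under these assumptions, converges to the largest solution of (14).

```python
import math

def survival_probability(K, mu, iterations=2000):
    """Iterate rho <- 1 - exp(-T rho), starting from rho = 1.

    K[i][j] is the kernel value between types i and j, mu[i] is the
    measure of type i, and (T f)(i) = sum_j K[i][j] * f[j] * mu[j].
    """
    m = len(mu)
    rho = [1.0] * m
    for _ in range(iterations):
        t_rho = [sum(K[i][j] * rho[j] * mu[j] for j in range(m)) for i in range(m)]
        rho = [1.0 - math.exp(-t) for t in t_rho]
    return rho

def rho_of_kernel(K, mu):
    """rho(kappa): the integral of the type-wise survival probability."""
    return sum(r * w for r, w in zip(survival_probability(K, mu), mu))
```

For the constant kernel $c$ on a single type this reduces to the classical equation $\rho = 1 - e^{-c\rho}$; the iteration converges slowly near the critical value $c = 1$, so the fixed iteration count is only adequate away from criticality.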

It is well known that the unit ball of $L^\infty(S,\mu)$ is sequentially compact in the
weak-* topology when $L^1(S,\mu)$ is separable. (The unit ball of $L^\infty$ is always
compact, but not necessarily sequentially compact otherwise.) For the special
case $S = [0, 1]$, let $(f_n)$ be a sequence in the unit ball of $L^\infty([0, 1])$. This sequence
has a subsequence $(f_{n_i})$ such that $\int_I f_{n_i}$ converges for each of the countably
many intervals $I$ with rational endpoints. Since the $f_{n_i}$ are uniformly bounded,
this is enough to ensure weak-* convergence.

Since $\|\rho_n\|_\infty \le 1$ for every $n$, by sequential compactness there is some $\rho^* \in
L^\infty(S,\mu)$ and some subsequence of $(\kappa_n)$ along which $\rho_n \xrightarrow{w^*} \rho^*$. From now on
we restrict our attention to such a subsequence.

Now
\[ \|T_{\kappa_n}\rho_n - T_\kappa\rho_n\|_{L^1} \le \|\kappa_n - \kappa\|_{L^1}\|\rho_n\|_\infty \le \|\kappa_n - \kappa\|_{L^1} \to 0. \]
Also, by Lemma 1.11, $\|T_\kappa\rho_n - T_\kappa\rho^*\|_{L^1} \to 0$. Hence $T_{\kappa_n}\rho_n \to T_\kappa\rho^*$ in $L^1$.
Passing to a subsequence, we may assume that $T_{\kappa_n}\rho_n \to T_\kappa\rho^*$ a.e. But then,
using (14),
\[ \rho_n = 1 - e^{-T_{\kappa_n}\rho_n} \to 1 - e^{-T_\kappa\rho^*} \quad\text{a.e.} \]
From this and dominated convergence, it follows that
\[ \rho_n \xrightarrow{w^*} 1 - e^{-T_\kappa\rho^*}. \]
Since $\rho_n \xrightarrow{w^*} \rho^*$, it follows that $\rho^* = 1 - e^{-T_\kappa\rho^*}$ a.e.

Let $\rho(x)$ denote the survival probability of $X_\kappa(x)$. Since $\kappa$ is irreducible, by
[4, Theorem 6.2], either $\rho^* = \rho$ a.e. or $\rho^* = 0$ a.e. In the first case,
\[ \rho(\kappa_n) = \int \rho_n(x)\,d\mu(x) \to \int \rho^*(x)\,d\mu(x) = \rho(\kappa), \]
as desired. In the second case, we have $\rho(\kappa_n) \to 0$ similarly.

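All that remains is to rule out the possibility that $\rho(\kappa_n) \to 0 < \rho(\kappa)$. This
is not hard using the results in [4]. For $M > 0$, let $\kappa^M$ denote the pointwise
minimum of $\kappa$ and $M$, and define $\kappa_n^M$ similarly. Suppose that $\rho(\kappa) > 0$. Then
$\|T_\kappa\| > 1$. As shown in the proof of [4, Lemma 5.16], we have $\|T_{\kappa^M}\| \nearrow \|T_\kappa\|$
as $M \to \infty$, so there is some $M$ with $c = \|T_{\kappa^M}\| > 1$. Fix such an $M$. Since
\[ \|\kappa_n^M - \kappa^M\|_{L^1} \le \|\kappa_n - \kappa\|_{L^1} \to 0, \tag{15} \]
and the kernels $\kappa_n^M$ and $\kappa^M$ are uniformly bounded, we have $\|T_{\kappa_n^M}\| \to \|T_{\kappa^M}\| =
c > 1$. In particular, for all large enough $n$ we have $\|T_{\kappa_n^M}\| > (c + 1)/2 > 1$.
Finally, it follows from [4, Remark 5.14] that we have
\[ \rho(\kappa_n^M) \ge \frac{\|T_{\kappa_n^M}\| - 1}{\sup \kappa_n^M} \ge \frac{(c-1)/2}{M}. \]
Since $\rho(\kappa_n) \ge \rho(\kappa_n^M)$ it follows that $\rho(\kappa_n) \not\to 0$, and the proof is complete. □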
If we assume cut convergence instead of $L^1$ convergence, then using the fact
that
\[ \|T_{\kappa_n} f - T_\kappa f\|_{L^1} \le \|\kappa_n - \kappa\|_\square \|f\|_\infty \]
in place of the corresponding observation for the $L^1$ norm, the first part of the
proof above goes through unchanged, showing that $\rho^* = \rho$ a.e. or $\rho^* = 0$ a.e.
Unfortunately, we do not know how to exclude the possibility that $\rho(\kappa_n) \to 0
< \rho(\kappa)$, except by appealing to Theorem 1.1, i.e., working with graphs. The
problem is that the relation equivalent to (15) for the cut norm rather than the
$L^1$ norm does not hold in general. Of course, given that Theorem 1.9 is true, it
is almost guaranteed that it has a direct analytic proof.

As discussed in [6, Section 2], until recently there was another example of
an analytic fact about kernels whose only known proof involved graphs (and
the cut metric), namely that two bounded kernels may be coupled to agree a.e.
if and only if their 'graphical moments' (or subgraph counts) are equal. This
follows from the results of Borgs, Chayes, Lovász, Sós and Vesztergombi [5]
concerning metrics for graphs (see [6]). However, by now there are analytic
proofs: Janson and Diaconis [13] showed that it also follows from results of
Hoover and Kallenberg on exchangeable arrays. A direct (and far from simple)
proof has recently been given by Borgs, Chayes and Lovász [5].



2 Proofs of Theorems 1.1-1.4

In this section we shall prove our main results; the strategy of the proof of
Theorem 1.1 is as follows. First, in Subsection 2.1 we shall show that if each $\kappa_n$ is an
$n$-by-$n$ kernel and $\delta_\square(\kappa_n, \kappa) \to 0$, then almost all of the weight of $\kappa_n$ comes from
values that are $o(n)$. This will allow us to assume that all edge probabilities in
$G(A_n)$ are $o(1)$. It then follows that the expected number of small tree
components in $G(A_n)$ is close to what it 'should be', i.e., $n$ times a certain function of
the kernel $\kappa_{A_n}$. In Subsection 2.2 we show that this function is continuous with
respect to the cut metric. This then tells us that we have almost the 'right'
number of vertices in small components; the details are given in Subsection 2.3.
Next, in Subsection 2.4 we complete the proof of Theorem 1.1 by showing that
in the irreducible case, almost all vertices in large components are in a single
component, using a method from Bollobás, Borgs, Chayes and Riordan [3]. In
Subsection 2.5 we treat the reducible case, proving Theorem 1.2. Finally, in
Subsection 2.6 we prove our stability and concentration results, Theorems 1.3
and 1.4.

For convenience, in this section we assume, as we may, that all kernels are
on [0, 1], unless explicitly stated otherwise.

2.1 Eliminating large edge weights

In Theorem 2.1 of [7] it was shown that if $(G_n)$ is a sequence of graphs in which
$G_n$ has $n$ vertices and $O(n)$ edges, $A_n$ is the adjacency matrix of $G_n$, $\kappa$ is a kernel
and $\delta_\square(nA_n, \kappa) \to 0$, then $\kappa = 0$ a.e. and $e(G_n) = o(n)$. A simple modification
of the proof gives the following lemma. Recall that a matrix denoted $A_n$ is
assumed to be $n$-by-$n$.

Lemma 2.1. Suppose that $\kappa$ is a kernel and $(A_n)$ a sequence of non-negative
matrices such that $\delta_\square(A_n, \kappa) \to 0$. Then there is some function $M(n)$ with
$M(n) = o(n)$ such that only $o(n)$ entries of $A_n$ exceed $M(n)$, and the sum of
these entries is $o(n^2)$.

A consequence of this is that if $A'_n$ is obtained from $A_n$ by taking the
pointwise minimum with $M(n)$, then $\delta_\square(A'_n, \kappa) \to 0$.

Proof. Although the details are almost exactly the same as in [7], we spell them
out. We write $\kappa_n$ for $\kappa_{A_n}$.

Since $\delta_\square(\kappa_n, \kappa) \to 0$, we may choose rearrangements $\kappa^{(\tau_n)}$ of $\kappa$ such that
\[ \|\kappa_n - \kappa^{(\tau_n)}\|_\square \to 0. \tag{16} \]
It suffices to show that for any $c > 0$, the sum of the entries of $A_n$ exceeding
$cn$ is at most $c^2n^2$ for $n$ large enough. This implies that there are at most $cn$
such entries, and the result then follows by letting $c$ tend to 0.

Suppose for a contradiction that there is some $c > 0$ such that, for infinitely
many $n$, the sum of the entries of $A_n$ exceeding $cn$ is at least $c^2n^2$; from now on
we fix such a $c$ and restrict our attention to the corresponding values of $n$. Let
$G_n$ be the graph whose edges correspond to those entries of $A_n$ which exceed
$cn$. Let $M_n$ be a largest matching in $G_n$.

Suppose first that $|V(M_n)|/n \to 0$. Let $S_n$ be the subset of [0, 1] corresponding
to the vertex set of $M_n$, so $\mu(S_n) = |V(M_n)|/n \to 0$. Every edge of weight
at least $cn$ meets a vertex of $M_n$, so
\[ \int_{S_n \times [0,1]} \kappa_n \ge \frac{1}{2n^2} \sum_{a_{ij} > cn} a_{ij} \ge \frac{c^2}{2}, \]
where the factor 2 accounts for the double counting of edges within $V(M_n)$.
From (16), writing $S'_n$ for $\tau_n(S_n)$, we have
\[ \int_{S'_n \times [0,1]} \kappa = \int_{S_n \times [0,1]} \kappa^{(\tau_n)} \ge \int_{S_n \times [0,1]} \kappa_n - o(1) \ge c^2/2 - o(1), \]
so $\int_{S'_n \times [0,1]} \kappa \not\to 0$. Since $\mu(S'_n \times [0, 1]) = \mu(S'_n) = \mu(S_n) \to 0$, this contradicts
integrability of $\kappa$.

Passing to a subsequence, we may thus assume that for some $a > 0$, every
maximal matching $M_n$ meets at least $an$ vertices.

Since $\kappa$ is integrable, we have $\int \kappa 1_{\{\kappa > C\}} \to 0$ as $C \to \infty$, where $1_{\{\kappa > C\}} :
[0,1]^2 \to \{0,1\}$ is the indicator of the event that $\kappa(x,y) > C$. In particular,
there is a $C < \infty$ with $\int \kappa 1_{\{\kappa > C\}} < ac/4$. Fix an $n$ with $n > 4C/(ac)$, noting
that if $S \subset [0, 1]^2$ satisfies $\mu(S) \le 1/n$, then
\[ \int_S \kappa \le C\mu(S) + \int \kappa 1_{\{\kappa > C\}} < C/n + ac/4 < ac/2. \tag{17} \]
Choosing $n$ large enough, we may assume from (16) that there is a $\kappa' = \kappa^{(\tau_n)} \sim \kappa$
with
\[ \|\kappa_n - \kappa'\|_\square < ac/25. \tag{18} \]
Given subsets $U$ and $V$ of $[n]$ let
\[ A_n(U,V) = \sum_{u \in U} \sum_{v \in V} a_{uv}. \]
Let $M_n = \{u_1v_1, \ldots, u_rv_r\}$ be a matching in $G_n$ with $r \ge an$, and set
$U = \{u_i\}$ and $V = \{v_i\}$. Identifying subsets of $[n]$ with the corresponding
unions of intervals of length $1/n$, from (18) we have
\[ \Bigl| \frac{A_n(U,V)}{n^2} - \int_{U \times V} \kappa' \Bigr| < ac/25. \]



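Let $U'$ be a random subset of $U$ obtained by selecting each vertex independently
with probability 1/2, and let $V'$ be the complementary subset of $V$, defined by
$V' = \{v_i : u_i \notin U'\}$. The edges of our matching $M_n$ never appear as edges from
$U'$ to $V'$. On the other hand, any other edge $u_iv_j$, $i \ne j$, from $U$ to $V$ has
probability 1/4 of appearing. Hence,
\[ \mathbb{E}\,A_n(U',V') = \frac14 \Bigl( A_n(U,V) - \sum_{i=1}^r a_{u_iv_i} \Bigr). \]
Similarly, writing $S \subset [0,1]^2$ for the union of the $r$ $1/n$-by-$1/n$ squares
corresponding to the edges $u_iv_i$, we have
\[ \mathbb{E} \int_{U' \times V'} \kappa' = \frac14 \int_{(U \times V) \setminus S} \kappa'. \]
Combining the last three displayed equations using the triangle inequality, and
noting that $\mu(S) = r/n^2 \le 1/n$, it follows that
\[ \Bigl| \mathbb{E} \Bigl( \frac{A_n(U',V')}{n^2} - \int_{U' \times V'} \kappa' \Bigr) \Bigr|
   \ge \frac{1}{4n^2} \sum_{i=1}^r a_{u_iv_i} - \frac14 \int_S \kappa' - \frac{ac}{100}
   \ge \frac{(an)(cn)}{4n^2} - \frac{ac}{8} - \frac{ac}{100} > \frac{ac}{16}, \]
using (17). On the other hand, from (18),
\[ \Bigl| \frac{A_n(U',V')}{n^2} - \int_{U' \times V'} \kappa' \Bigr| < ac/25 \]
always holds, which implies a corresponding upper bound on the difference of
the expectations. Since $ac/25 < ac/16$, we obtain a contradiction, completing
the proof. □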
2.2 Tree integrals and the cut metric

In this subsection we shall show that a certain function of a kernel whose role
will become clear later is continuous (in fact Lipschitz) with respect to the cut
metric. Here there is no particular reason to consider only the standard ground
space; instead we consider an arbitrary probability space.

Let $(S, \mathcal{F}, \mu)$ be a probability space. Let $\mathcal{W}$ be the set of all integrable
non-negative functions $W : S \times S \to [0,\infty)$, and let $\mathcal{W}_{\mathrm{sym}}$ be the subset of
symmetric functions. The integrability assumption is for convenience only; the
results extend to arbitrary measurable non-negative functions if one is a little
careful with infinities in the proofs. However, we shall only need the integrable
case.

For $W \in \mathcal{W}$, let
\[ \lambda_W(x) := \int_S W(x,y)\,d\mu(y) \tag{19} \]
and
\[ \lambda'_W(y) := \int_S W(x,y)\,d\mu(x) \tag{20} \]
denote the marginals of $W$; we allow the value $+\infty$, although by our assumption
that $W$ is integrable, $\lambda_W(x) < \infty$ a.e. and $\lambda'_W(y) < \infty$ a.e. Note that $\lambda_W$ and
$\lambda'_W$ are measurable functions from $S$ to $[0, \infty]$.

Throughout this subsection we work with the following definition of the cut
norm: if $W \in L^1(S^2)$, then
\[ \|W\|_\square := \sup_{\|f\|_\infty \le 1,\ \|g\|_\infty \le 1} \Bigl| \int_{S^2} f(x)g(y)W(x,y)\,d\mu(x)\,d\mu(y) \Bigr|. \tag{21} \]
It is immediate from the definition (21) that
\[ \|W\|_\square \le \|W\|_{L^1(S^2)} \tag{22} \]
and that, for any bounded functions $h$ and $k$ on $S$,
\[ \|h(x)k(y)W(x,y)\|_\square \le \|h\|_\infty \|k\|_\infty \|W\|_\square. \tag{23} \]

Before stating the main result of this subsection, let us note that if two
kernels are close in cut norm, then their marginals are close in $L^1$. (This is
doubtless well known, but in any case very easy to see.)

Lemma 2.2. If $W_1, W_2 \in \mathcal{W}$, then $\|\lambda_{W_1} - \lambda_{W_2}\|_{L^1(S)} \le \|W_1 - W_2\|_\square$.

Proof. If $f \in L^\infty(S)$, then
\[ \int_S (\lambda_{W_1}(x) - \lambda_{W_2}(x)) f(x)\,d\mu(x) = \int_{S^2} f(x)(W_1(x,y) - W_2(x,y))\,d\mu(x)\,d\mu(y), \]
and the result follows from (21), letting $g(y) = 1$ and taking the supremum over
all $f$ with $\|f\|_\infty \le 1$. (Or simply taking $f(x)$ equal to the sign of $\lambda_{W_1}(x) -
\lambda_{W_2}(x)$.) □
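
Lemma 2.2 can be checked by hand on small step kernels. The sketch below is illustrative only: it assumes kernels on [0, 1] that are constant on an $n$-by-$n$ grid of squares (given as matrices), and it computes the cut norm (21) by brute force, using the fact that a bilinear form on a cube attains its supremum at sign vectors. The helper names are ours.

```python
from itertools import product

def cut_norm(D):
    """Cut norm, in the sense of (21), of the step function equal to
    D[i][j] on the (i, j) square of side 1/n. Exponential in n."""
    n = len(D)
    best = 0.0
    for f in product((-1.0, 1.0), repeat=n):
        # For fixed f, the best g takes the sign of each column sum.
        total = sum(abs(sum(f[i] * D[i][j] for i in range(n))) for j in range(n))
        best = max(best, total)
    return best / n**2

def marginal_l1_distance(W1, W2):
    """L^1 distance between the marginals of two n-by-n step kernels."""
    n = len(W1)
    return sum(abs(sum(W1[i][j] - W2[i][j] for j in range(n)) / n)
               for i in range(n)) / n
```

On any pair of such matrices one should find `marginal_l1_distance(W1, W2)` at most `cut_norm` of the difference, which in turn is at most the $L^1$ norm of the difference, as in (22).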
We now turn to the integrals we shall consider, one for each finite graph $F$.
Given a finite graph $F$ with vertex set $\{1, \ldots, r\}$ and $W \in \mathcal{W}_{\mathrm{sym}}$, let
\[ t_{\mathrm{isol}}(F,W) := \int_{S^r} \prod_{ij \in E(F)} W(x_i,x_j) \prod_{k=1}^r e^{-\lambda_W(x_k)}\,d\mu(x_1) \cdots d\mu(x_r). \tag{24} \]
The reason for the notation is that $t_{\mathrm{isol}}(F, W)$ corresponds roughly to $1/n$ times
the expected number of isolated copies of $F$ in a certain random graph defined
from $W$.
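
To make (24) concrete, here is a small sketch that evaluates $t_{\mathrm{isol}}(F,W)$ for a step kernel by direct enumeration. It assumes (for illustration only) that $W$ is given as a symmetric $n$-by-$n$ matrix with each type carrying measure $1/n$, and that $F$ is given by an edge list on vertices $0, \ldots, r-1$; the function name is ours.

```python
import math
from itertools import product

def t_isol(edges, r, W):
    """t_isol(F, W) from (24) for a symmetric n-by-n step kernel W."""
    n = len(W)
    lam = [sum(row) / n for row in W]  # marginal of W on each type
    total = 0.0
    for x in product(range(n), repeat=r):
        term = math.prod(W[x[i]][x[j]] for i, j in edges)
        term *= math.exp(-sum(lam[v] for v in x))
        total += term
    return total / n**r
```

For the constant kernel $W = c$ and a tree $F$ on $r$ vertices the integrand in (24) is constant, so the value is $c^{r-1}e^{-rc}$; this gives a quick sanity check of the routine.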

Our aim in this subsection is to prove the following result.

Theorem 2.3. Let $F$ be a tree. Then $W \mapsto t_{\mathrm{isol}}(F,W)$ is a bounded map on
$\mathcal{W}_{\mathrm{sym}}$ that is Lipschitz continuous in the cut norm. In other words, there exists a
constant $C$ (depending on $F$ only) such that $t_{\mathrm{isol}}(F, W) \le C$ for all $W \in \mathcal{W}_{\mathrm{sym}}$,
and $|t_{\mathrm{isol}}(F, W) - t_{\mathrm{isol}}(F, W')| \le C\|W - W'\|_\square$ for all $W, W' \in \mathcal{W}_{\mathrm{sym}}$.

We shall prove Theorem 2.3 via a sequence of lemmas. The first step will be
to transform (24) to an integral of a product over edges only, rather than over
edges and vertices. This will involve considering asymmetric kernels, as well as
different kernels for different edges of $F$.

Given a tree $F$ with $r$ vertices in which each edge has an arbitrary direction,
and for every edge $ij \in F$ a (not necessarily symmetric) kernel $W_{ij} \in \mathcal{W}$, set
\[ t_0(F, (W_{ij})_{ij \in E(F)}) := \int_{S^r} \prod_{ij \in E(F)} W_{ij}(x_i,x_j)\,d\mu(x_1) \cdots d\mu(x_r). \tag{25} \]
Note that the exponential factors $e^{-\lambda_W(x_k)}$ present in (24) are missing from
(25).

We shall reintroduce the exponential factors by attaching them to the kernels
$W_{ij}$. Recalling the definitions of the marginals $\lambda_W$ and $\lambda'_W$ in (19) and (20),
for real $a, b > 0$ let
\[ W^{(a,b)}(x,y) := e^{-a\lambda_W(x)} W(x,y) e^{-b\lambda'_W(y)}. \tag{26} \]
Finally, let $d_i$ be the (total) degree of vertex $i$ in $F$. Then, comparing (24) and
(25), for every symmetric $W : S^2 \to [0, \infty)$ we have
\[ t_{\mathrm{isol}}(F, W) = t_0\bigl(F, (W^{(1/d_i, 1/d_j)})_{ij}\bigr). \tag{27} \]
To study $t_{\mathrm{isol}}(F,W)$, we shall first study the map $W \mapsto W^{(a,b)}$, and then
study the behaviour of $t_0$ on the restricted set of asymmetric kernels that arise
as images of this map.

Lemma 2.4. For every fixed $a, b > 0$, the map $W \mapsto W^{(a,b)}$ is Lipschitz
continuous on $\mathcal{W}$ in the cut norm; more precisely,
\[ \|W_1^{(a,b)} - W_2^{(a,b)}\|_\square \le 7\|W_1 - W_2\|_\square \]
for all $W_1, W_2 \in \mathcal{W}$. Also, for every $W \in \mathcal{W}$, $\sup_x \lambda_{W^{(a,b)}}(x) \le e^{-1}/a$ and
$\sup_y \lambda'_{W^{(a,b)}}(y) \le e^{-1}/b$.

Surprisingly, this turns out to be the hardest part of the proof of
Theorem 2.3.

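Proof. Let us start with the final inequalities, which are immediate consequences
of the inequality $te^{-t} \le e^{-1}$. Indeed,
\[ \lambda_{W^{(a,b)}}(x) = \int_S e^{-a\lambda_W(x)} W(x,y) e^{-b\lambda'_W(y)}\,d\mu(y) \le \int_S e^{-a\lambda_W(x)} W(x,y)\,d\mu(y) = \lambda_W(x) e^{-a\lambda_W(x)} \le e^{-1}/a, \]
and similarly $\lambda'_{W^{(a,b)}}(y) \le e^{-1}/b$.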
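
The marginal bounds just proved are easy to observe on step kernels. The following sketch is illustrative only: it assumes a (not necessarily symmetric) kernel given as an $n$-by-$n$ matrix with each type of measure $1/n$, and the helper names `tilt` and `max_marginals` are ours.

```python
import math

def tilt(W, a, b):
    """Step-kernel version of W^{(a,b)}(x,y) = e^{-a lam(x)} W(x,y) e^{-b lam'(y)}."""
    n = len(W)
    lam = [sum(W[i][j] for j in range(n)) / n for i in range(n)]    # row marginal
    lam_p = [sum(W[i][j] for i in range(n)) / n for j in range(n)]  # column marginal
    return [[math.exp(-a * lam[i]) * W[i][j] * math.exp(-b * lam_p[j])
             for j in range(n)] for i in range(n)]

def max_marginals(W):
    """Largest row marginal and largest column marginal of a step kernel."""
    n = len(W)
    rows = max(sum(W[i][j] for j in range(n)) / n for i in range(n))
    cols = max(sum(W[i][j] for i in range(n)) / n for j in range(n))
    return rows, cols
```

However large the entries of `W` are, the two values returned by `max_marginals(tilt(W, a, b))` should not exceed $e^{-1}/a$ and $e^{-1}/b$ respectively.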

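Turning to the main assertion, let $W_1, W_2 \in \mathcal{W}$. To simplify the notation
set $\lambda_j := \lambda_{W_j}$ and $\lambda'_j := \lambda'_{W_j}$ for $j = 1, 2$. It will turn out that we have to
argue separately according to which of $\lambda_1(x)$ and $\lambda_2(x)$ is larger, and similarly
for $\lambda'_1(y)$ and $\lambda'_2(y)$. Accordingly, define the indicator functions
\[ I_1(x) := 1[\lambda_1(x) \le \lambda_2(x)], \qquad I_2(x) := 1[\lambda_1(x) > \lambda_2(x)], \]
\[ I'_1(y) := 1[\lambda'_1(y) \le \lambda'_2(y)], \qquad I'_2(y) := 1[\lambda'_1(y) > \lambda'_2(y)], \]
so $I_1(x) + I_2(x) = I'_1(y) + I'_2(y) = 1$.

We may write $W_1^{(a,b)} - W_2^{(a,b)}$, a difference of two three-term products, as a
telescopic sum of three terms in the usual way. In particular, we have
\begin{align*}
W_1^{(a,b)} - W_2^{(a,b)} &= (e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)}) e^{-b\lambda'_1(y)} W_1(x,y) \\
&\quad + e^{-a\lambda_2(x)} (e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)}) W_1(x,y) \\
&\quad + e^{-a\lambda_2(x)} e^{-b\lambda'_2(y)} (W_1(x,y) - W_2(x,y)). \tag{28}
\end{align*}
It will turn out that this decomposition is only useful when $\lambda_1(x) \le \lambda_2(x)$ and
$\lambda'_1(y) \le \lambda'_2(y)$, so we shall multiply by the indicator function $I_1(x)I'_1(y)$.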

To bound the final term in (28), note that $0 \le I_1(x)e^{-a\lambda_2(x)} \le 1$ and
$0 \le I'_1(y)e^{-b\lambda'_2(y)} \le 1$, so from (23) we have
\[ \|I_1(x)I'_1(y)e^{-a\lambda_2(x)}e^{-b\lambda'_2(y)}(W_1(x,y) - W_2(x,y))\|_\square \le \|W_1 - W_2\|_\square. \tag{29} \]
For the remaining terms we estimate the $L^1$ norm, recalling (22). Turning
to the first term, by the mean value theorem, if $\lambda_1(x) \le \lambda_2(x)$ then for some
$\xi \in [\lambda_1(x), \lambda_2(x)]$ we have
\[ |e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)}| = a|\lambda_1(x) - \lambda_2(x)|e^{-a\xi} \le a|\lambda_1(x) - \lambda_2(x)|e^{-a\lambda_1(x)}, \]
where $\lambda_1(x) \le \lambda_2(x)$ is used in the final inequality. It follows that
\[ I_1(x)|e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)}| \le a|\lambda_1(x) - \lambda_2(x)|e^{-a\lambda_1(x)}. \]
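Thus,
\begin{align*}
\|I_1(x)I'_1(y)(e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)})e^{-b\lambda'_1(y)}W_1(x,y)\|_{L^1(S^2)}
&\le \|a|\lambda_1(x) - \lambda_2(x)|e^{-a\lambda_1(x)}W_1(x,y)\|_{L^1(S^2)} \\
&= \int_S \int_S a|\lambda_1(x) - \lambda_2(x)|e^{-a\lambda_1(x)}W_1(x,y)\,d\mu(y)\,d\mu(x) \\
&= \int_S a|\lambda_1(x) - \lambda_2(x)|e^{-a\lambda_1(x)}\lambda_1(x)\,d\mu(x) \\
&\le e^{-1}\int_S |\lambda_1(x) - \lambda_2(x)|\,d\mu(x) = e^{-1}\|\lambda_1 - \lambda_2\|_{L^1(S)} \\
&\le e^{-1}\|W_1 - W_2\|_\square,
\end{align*}
where we used $te^{-t} \le e^{-1}$ for the second last step and Lemma 2.2 for the final
step.

Similarly, for the second term in (28) we obtain the bound
\[ \|I_1(x)I'_1(y)e^{-a\lambda_2(x)}(e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)})W_1(x,y)\|_{L^1(S^2)} \le e^{-1}\|W_1 - W_2\|_\square. \]
Putting these two bounds together with (29), comparing with (28) we see that
\[ \|I_1(x)I'_1(y)(W_1^{(a,b)}(x,y) - W_2^{(a,b)}(x,y))\|_\square \le (1 + 2e^{-1})\|W_1 - W_2\|_\square. \tag{30} \]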

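So far we treated the case $\lambda_1(x) \le \lambda_2(x)$, $\lambda'_1(y) \le \lambda'_2(y)$. The remaining
three cases are treated similarly.

More precisely, for $\lambda_1(x) \le \lambda_2(x)$, $\lambda'_1(y) > \lambda'_2(y)$, we use
\begin{align*}
W_1^{(a,b)} - W_2^{(a,b)} &= (e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)})e^{-b\lambda'_1(y)}W_1(x,y) \\
&\quad + e^{-a\lambda_2(x)}e^{-b\lambda'_1(y)}(W_1(x,y) - W_2(x,y)) \\
&\quad + e^{-a\lambda_2(x)}(e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)})W_2(x,y)
\end{align*}
in place of (28) to prove the equivalent of (30) with $I_1(x)I'_2(y)$ in place of
$I_1(x)I'_1(y)$.

For $\lambda_1(x) > \lambda_2(x)$, $\lambda'_1(y) \le \lambda'_2(y)$ we use
\begin{align*}
W_1^{(a,b)} - W_2^{(a,b)} &= e^{-a\lambda_1(x)}(e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)})W_1(x,y) \\
&\quad + e^{-a\lambda_1(x)}e^{-b\lambda'_2(y)}(W_1(x,y) - W_2(x,y)) \\
&\quad + (e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)})e^{-b\lambda'_2(y)}W_2(x,y)
\end{align*}
to obtain a bound with $I_2(x)I'_1(y)$ as the indicator function.
Finally, for $\lambda_1(x) > \lambda_2(x)$, $\lambda'_1(y) > \lambda'_2(y)$ we use
\begin{align*}
W_1^{(a,b)} - W_2^{(a,b)} &= e^{-a\lambda_1(x)}e^{-b\lambda'_1(y)}(W_1(x,y) - W_2(x,y)) \\
&\quad + e^{-a\lambda_1(x)}(e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)})W_2(x,y) \\
&\quad + (e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)})e^{-b\lambda'_2(y)}W_2(x,y)
\end{align*}
for $I_2(x)I'_2(y)$.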

The key point is that in all cases, when we come to apply the bound obtained
from the mean value theorem, when dealing with a term $e^{-a\lambda_1(x)} - e^{-a\lambda_2(x)}$ we
obtain a bound involving $e^{-a\lambda_i(x)}$ for $i = 1$ or 2 depending on which of $\lambda_1(x)$
and $\lambda_2(x)$ is larger. For the rest of the argument to work, it is important that
the term we consider contains a factor $W_i(x, y)$ rather than $W_{3-i}(x, y)$. Similar
comments apply to the $e^{-b\lambda'_1(y)} - e^{-b\lambda'_2(y)}$ terms. Fortunately, we can ensure
that this is always the case, as shown by the decompositions above. Informally
speaking, we simply choose the right moment to switch from $W_1$ to $W_2$.

Combining (30) and its equivalents, noting that $I_1(x)I'_1(y) + I_1(x)I'_2(y) +
I_2(x)I'_1(y) + I_2(x)I'_2(y) = 1$, we see that
\[ \|W_1^{(a,b)} - W_2^{(a,b)}\|_\square \le (4 + 8e^{-1})\|W_1 - W_2\|_\square < 7\|W_1 - W_2\|_\square. \qquad □ \]

Remark 2.5. Although we do not care about the constant, let us note that
the four estimates (29) above can be combined into a single application of (23),
with $h(x) = I_1(x)e^{-a\lambda_2(x)} + I_2(x)e^{-a\lambda_1(x)}$ and $k(y) = I'_1(y)e^{-b\lambda'_2(y)} + I'_2(y)e^{-b\lambda'_1(y)}$.
This gives $1 + 8e^{-1} < 4$ in place of $4 + 8e^{-1}$.

We next turn to the study of $t_0(F, \cdot)$ as defined by (25), restricting our
attention to kernels with bounded marginals. It turns out that we must first
study a related function $t_1$, which may be seen as a rooted version of $t_0$.

Given a rooted directed graph $F$ with vertex set $\{1, 2, \ldots, r\}$ and root 1,
and functions $W_{ij} \in \mathcal{W}$, let
\[ t_1(F, (W_{ij})_{ij \in E(F)}; x_1) := \int_{S^{r-1}} \prod_{ij \in E(F)} W_{ij}(x_i,x_j)\,d\mu(x_2) \cdots d\mu(x_r). \]
Note that this is a function of $x_1 \in S$, and that
\[ t_0(F, (W_{ij})_{ij \in E(F)}) = \int_S t_1(F, (W_{ij})_{ij \in E(F)}; x)\,d\mu(x). \tag{31} \]
Let $\mathcal{W}_B := \{W \in \mathcal{W} : \sup_x \lambda_W(x),\ \sup_y \lambda'_W(y) \le B\}$.

Lemma 2.6. Let $F$ be a rooted directed tree and $(W_{ij})_{ij \in E(F)}$ a family with
$W_{ij} \in \mathcal{W}_B$ for all $ij$. Then for all $x \in S$,
\[ t_1(F, (W_{ij})_{ij \in E(F)}; x) \le B^{e(F)}. \]

Proof. A simple induction on the number $e(F)$ of edges of $F$. If $e(F) = 0$, so $F$
consists of just a single vertex, then both sides are equal to 1. For $e(F) > 0$, pick
a leaf $v$ of $F$ that is not the root, with neighbour $w$. We may assume without
loss of generality that the edge $wv$ is oriented from $w$ to $v$. In the integrand
appearing in the left hand side above, there is only one factor that depends on
$x_v$, namely $W_{wv}(x_w, x_v)$. Integrating out over $x_v$, this integrates to $\lambda_{W_{wv}}(x_w)$.
Replacing $\lambda_{W_{wv}}(x_w)$ by $B$, which is an upper bound by assumption, we see
that $t_1(F, \cdot\,; x) \le B\,t_1(F - v, \cdot\,; x)$, and the result follows by induction. □
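
The leaf-stripping induction above translates directly into a recursion. The following sketch is illustrative only: it assumes step kernels given as $n$-by-$n$ matrices (each type of measure $1/n$) and, for simplicity, a rooted tree with every edge oriented away from the root; the data layout (`children`, `kernels`) is a hypothetical choice of ours.

```python
def t1(children, kernels, root, n):
    """Values of t_1(F, (W_ij); x) for x in each of the n types.

    children[v] lists the children of v, and kernels[(v, c)] is the
    n-by-n step kernel attached to the edge v -> c. Integrating out a
    whole subtree below c replaces it by a single function of x_c.
    """
    values = [1.0] * n
    for c in children.get(root, []):
        sub = t1(children, kernels, c, n)
        W = kernels[(root, c)]
        for x in range(n):
            values[x] *= sum(W[x][y] * sub[y] for y in range(n)) / n
    return values
```

If every row marginal of every kernel is at most `B`, each returned value is at most `B ** e`, where `e` is the number of edges, which is the content of Lemma 2.6 in this special case.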
Returning to the unrooted case, we are now ready for the final step in the
proof of Theorem 2.3.

Lemma 2.7. Let $F$ be a directed tree, and $B < \infty$ a constant. For all families
$(W_{ij})_{ij \in E(F)}$ and $(W'_{ij})_{ij \in E(F)}$ with $W_{ij}, W'_{ij} \in \mathcal{W}_B$, we have
\[ t_0(F, (W_{ij})_{ij \in E(F)}) \le B^{e(F)} \tag{32} \]
and
\[ |t_0(F, (W_{ij})_{ij \in E(F)}) - t_0(F, (W'_{ij})_{ij \in E(F)})| \le B^{e(F)-1} \sum_{ij \in E(F)} \|W_{ij} - W'_{ij}\|_\square. \tag{33} \]

Proof. The bound (32) is immediate from (31) and Lemma 2.6 by choosing an
arbitrary root.

For the Lipschitz estimate (33), it suffices to treat the case where the families
$W_{ij}$ and $W'_{ij}$ differ only on a single edge $ij$, say $ij = 12$. In this case, let $F_1$ and
$F_2$ be the two components of $F \setminus \{12\}$, and regard these as rooted trees with
roots 1 and 2, respectively. Then, simplifying the notation,
\[ t_0(F, (W_{ij})_{ij}) = \int_{S^2} t_1(F_1; x_1)\,t_1(F_2; x_2)\,W_{12}(x_1,x_2)\,d\mu(x_1)\,d\mu(x_2) \]
and similarly for $(W'_{ij})$. Thus, by (21),
\begin{align*}
|t_0(F, (W_{ij})_{ij}) - t_0(F, (W'_{ij})_{ij})|
&= \Bigl| \int_{S^2} t_1(F_1; x_1)\,t_1(F_2; x_2)\,(W_{12}(x_1,x_2) - W'_{12}(x_1,x_2))\,d\mu(x_1)\,d\mu(x_2) \Bigr| \\
&\le \|t_1(F_1)\|_\infty \|t_1(F_2)\|_\infty \|W_{12} - W'_{12}\|_\square.
\end{align*}
The result follows by Lemma 2.6. □

Putting the pieces together, Theorem 2.3 follows.

Proof of Theorem 2.3. In the light of (27), this is immediate from Lemmas 2.4
and 2.7. □

2.3 Small components 

Let $N_k(G)$ denote the number of vertices of a graph $G$ in components of order
$k$, and let $\rho_k(\kappa)$ denote the probability that $X_\kappa$ consists of exactly $k$ particles in
total. Our next aim is to prove the following lemma. Recall that $A_n$ is always
assumed to be $n$-by-$n$.

Lemma 2.8. Let $(A_n)$ be a sequence of non-negative symmetric matrices
converging in $\delta_\square$ to a kernel $\kappa$, and let $k \ge 1$ be fixed. Then $\mathbb{E}N_k(G(A_n))/n \to
\rho_k(\kappa)$.
As usual in sparse random graphs, the dominant contribution will be from
tree components. We start with a simple lemma showing that cyclic components
can be neglected.

Let us call a sequence $(A_n)$ of non-negative symmetric matrices (in which
$A_n$ is $n$-by-$n$ as usual) well behaved if all the diagonal entries are zero, and
$\max A_n = o(n)$, where $\max A_n$ is the largest entry in $A_n$. One useful property of
such sequences is that for them, the models $G(A_n)$ and $G_{\mathrm{Po}}(A_n)$ are essentially
equivalent, as shown by the following simple lemma.

Lemma 2.9. Let $\kappa$ be a kernel and let $(A_n)$ be a sequence of well-behaved
matrices with $\delta_\square(A_n, \kappa) \to 0$. Let $A'_n$ be the matrix with entries $a'_{ij}$ defined by
$1 - \exp(-a'_{ij}/n) = a_{ij}/n$, so that $G(A_n)$ has the distribution of $G_{\mathrm{Po}}(A'_n)$.
Then $\delta_\square(A'_n, \kappa) \to 0$.

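Proof. For $n$ large enough that $\max a_{ij} \le n/2$, say, from the definition of $a'_{ij}$
we have $|a_{ij} - a'_{ij}| = O(a_{ij}^2/n)$, with the implicit constant $C$ absolute. It follows that
\[ \sum_{ij} |a_{ij} - a'_{ij}| \le C \sum_{ij} a_{ij}^2/n \le C \max\{a_{ij}/n\} \sum_{ij} a_{ij} = o(1) \sum_{ij} a_{ij}, \]
using the well-behavedness assumption. Since $\delta_\square(A_n, \kappa) \to 0$, we have $\sum a_{ij} \sim
n^2 \int \kappa = O(n^2)$. Hence
\[ \delta_\square(\kappa_{A_n}, \kappa_{A'_n}) \le \|\kappa_{A_n} - \kappa_{A'_n}\|_{L^1} = n^{-2} \sum_{ij} |a_{ij} - a'_{ij}| = o(1), \]
and the result follows. □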
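
The estimate $|a_{ij} - a'_{ij}| = O(a_{ij}^2/n)$ can be seen numerically. The sketch below rests on an assumption that should be read as ours: that $a'_{ij}$ is determined by $1 - \exp(-a'_{ij}/n) = a_{ij}/n$, i.e. that a Poisson number of $ij$ edges with mean $a'_{ij}/n$ is non-zero with probability exactly $a_{ij}/n$. The function name is hypothetical.

```python
import math

def poisson_intensity(a, n):
    """Return a' with 1 - exp(-a'/n) = a/n (assumed relation; requires a < n)."""
    return -n * math.log1p(-a / n)
```

Since $-\log(1-x) = x + x^2/2 + \cdots \le x + x^2$ for $0 \le x \le 1/2$, the difference `poisson_intensity(a, n) - a` lies between 0 and `a * a / n` whenever `a <= n / 2`, so in this range the absolute constant can be taken to be 1.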

The point of Lemma 2.9 is that if we can prove that $G_{\mathrm{Po}}(A_n)$ has a certain
property whenever $\delta_\square(A_n, \kappa) \to 0$, then the same result for $G(A_n)$ follows: we
simply express $G(A_n)$ as $G_{\mathrm{Po}}(A'_n)$ as in Lemma 2.9, and apply our result for $G_{\mathrm{Po}}(\cdot)$ to
the sequence $(A'_n)$.

Our next lemma shows that the graphs we consider have few vertices in
small components containing cycles. Let $N_k^{\mathrm{t}}(G)$ denote the number of vertices
of a graph $G$ in tree components of order $k$, and $N_k^{\mathrm{c}}(G)$ the number in cyclic
components of order $k$, so $N_k(G) = N_k^{\mathrm{t}}(G) + N_k^{\mathrm{c}}(G)$.

Lemma 2.10. Let $(A_n)$ be a sequence of well-behaved matrices and $k \ge 2$ an
integer. Then $\mathbb{E}N_k^{\mathrm{c}}(G_n) = o(n)$, where $G_n = G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$.

Note that in this lemma there is no convergence assumption. Note also that
Lemma 2.10 immediately implies a corresponding result for $G_{\mathrm{Po}}(A_n)$, which is
simply the simple graph underlying $G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$, and so satisfies $N_k^{\mathrm{c}}(G_{\mathrm{Po}}(A_n)) \le
N_k^{\mathrm{c}}(G^{\mathrm{m}}_{\mathrm{Po}}(A_n))$. It also implies a corresponding result for $G(A_n)$; this may be
deduced from the result for $G_{\mathrm{Po}}(A_n)$ by expressing $G(A_n)$ as $G_{\mathrm{Po}}(A'_n)$ as above.

Proof. We shall consider an evolving version $G_n(t)$ of $G_n$. To define this, for
each possible edge $ij$, construct a Poisson process on [0, 1] with intensity $a_{ij}/n$;
the points of these processes will be the birth times of the $ij$ edges. Let $G_n(t)$
be the graph formed by all edges born by time $t$, noting that the number of $ij$
edges in $G_n(1)$ is Poisson with mean $a_{ij}/n$. Taking the processes independent,
$G_n(1)$ thus has the distribution of $G_n = G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$.

Let $M_{\le k}(G)$ denote the number of cyclic components of a (multi-)graph $G$
of order at most $k$; thus $N_k^{\mathrm{c}}(G) \le kM_{\le k}(G)$.

Let $f(t)$ denote the expectation of $M_{\le k}(G_n(t))$; then $f(0) = 0$ and $f(1) =
\mathbb{E}M_{\le k}(G_n)$, so $\mathbb{E}N_k^{\mathrm{c}}(G_n) \le kf(1)$, and it suffices to show that the derivative
of $f$ is bounded above by $o(n)$. Condition on $G_n(t)$, and consider the edges
born in a short time interval $[t, t + dt]$. Taking $dt$ small enough, the probability
that there is more than one such edge in any interval $[t, t + dt]$ is negligible.
The only way we can have $M_{\le k}(G + e) > M_{\le k}(G)$ is if $e$ joins two vertices
$i, j$ in some component of $G$ of order at most $k$. There are at most $kn$ such
pairs of vertices. Since the $a_{ij}$ are uniformly bounded by $o(n)$, the probability
$a_{ij}\,dt/n$ of adding $e = ij$ is $o(dt)$, and the probability of adding some such edge
is $o(kn\,dt) = o(n\,dt)$. Adding such an edge increases $M_{\le k}$ by at most 1, so the
expected increase in time $dt$ is at most $o(n\,dt)$ as required. □

We are now ready to prove Lemma 2.8.

Proof of Lemma 2.8. We claim that it suffices to prove the lemma under the
assumption that $(A_n)$ is well behaved, i.e., $\max A_n = o(n)$, and the diagonal
entries are 0.

To see this, note that by Lemma 2.1 there is some $\delta = \delta(n) \to 0$ such that
at most $\delta n$ entries of $A_n$ exceed $\delta n$, and the sum of these entries is at most $\delta n^2$.
Define $A'_n = (a'_{ij})$ by setting $a'_{ij} = 0$ if $a_{ij} > \delta n$ or if $i = j$, and setting $a'_{ij} = a_{ij}$
otherwise. Then
\[ \delta_\square(A_n, A'_n) \le n^{-2} \sum_{ij} |a_{ij} - a'_{ij}| = n^{-2} \sum_{a_{ij} > \delta n} a_{ij} + n^{-2} \sum_{i : a_{ii} \le \delta n} a_{ii} \le \delta + \delta = o(1). \]
Hence $\delta_\square(A'_n, \kappa) \to 0$, so the sequence $(A'_n)$ and kernel $\kappa$ satisfy the assumptions
of the lemma, and $(A'_n)$ is well behaved. In establishing our claim we may thus
assume that
\[ \mathbb{E}N_k(G(A'_n))/n \to \rho_k(\kappa). \tag{34} \]
But then the same result for $G(A_n)$ follows almost immediately. Indeed, we
may assume that $G(A'_n) \subset G(A_n)$, and we have
\[ \mathbb{E}\bigl|E(G(A_n)) \setminus E(G(A'_n))\bigr| = \mathbb{E}\bigl(e(G(A_n)) - e(G(A'_n))\bigr) \le \frac1n \sum_{ij} |a_{ij} - a'_{ij}| = o(n). \]
Since adding an edge to a graph $G$ changes $N_k(G)$ by at most $2k$, it follows that
\[ \mathbb{E}|N_k(G(A_n)) - N_k(G(A'_n))| = o(n), \]
which with (34) proves the same statement for $A_n$, establishing the claim.

From now on we suppose as we may that $(A_n)$ is well behaved. In the light
of Lemma 2.9 we may work with $G_{\mathrm{Po}}(A_n)$ instead of $G(A_n)$. In fact, we shall
work with $G_n = G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$, which has exactly the same component structure as
$G_{\mathrm{Po}}(A_n)$.

Given a loopless multi-graph $F$ on $[k]$ and a sequence $v = (v_1, \ldots, v_k)$ with
$1 \le v_i \le n$ for each $i$, set
\[ p_v(F) = p_v(F, A_n) = \prod_{ij \in E(F)} \frac{a_{v_iv_j}}{n} \prod_{uw : \{u,w\} \cap \{v_i\} \ne \emptyset} e^{-a_{uw}/n}, \tag{35} \]
where the second product is over all edges $uw$ of the complete graph on $[n]$
meeting $\{v_1, \ldots, v_k\}$.

Let us call a sequence $v = (v_1, \ldots, v_k)$ good if the $v_i$ are distinct, and bad
otherwise. If $F$ is a simple graph and $v$ is good, then $p_v(F)$ is the probability
that the vertices $v_1, \ldots, v_k$ of $G_n = G^{\mathrm{m}}_{\mathrm{Po}}(A_n)$ form a component isomorphic
to $F$, with the $i$th vertex of $F$ mapped to $v_i$. Hence, writing $n_F(G_n)$ for the
number of components of $G_n$ isomorphic to $F$, for simple $F$ we have
\[ \mathbb{E}n_F(G_n) = \frac{1}{\mathrm{aut}(F)} \sum_{v \text{ good}} p_v(F). \]
Our aim is to relate this sum with $F$ a tree to $t_{\mathrm{isol}}(F, \kappa_{A_n})$, and hence to
$t_{\mathrm{isol}}(F, \kappa)$.

Let $\lambda_\kappa(x)$ denote the marginal of $\kappa$, defined by (19). For $1 \le i \le n$, set
\[ \lambda_n(i) = \frac1n \sum_j a_{ij}, \]
so $\lambda_n$ is essentially the marginal of $\kappa_{A_n}$. (More precisely, $\lambda_n(i)$ gives the value
of the marginal of $\kappa_{A_n}$ at any point of the interval of length $1/n$ corresponding
to vertex $i \in [n]$.)

Given a multi-graph $F$ and a (not necessarily good) sequence $v$, let
\[ p^0_v(F) = p^0_v(F, A_n) = \prod_{ij \in E(F)} \frac{a_{v_iv_j}}{n} \prod_{i=1}^k e^{-\lambda_n(v_i)}. \tag{36} \]
Expanding each term $\lambda_n(v_i)$ and then comparing (35) and (36), we see that if
$v$ is good then the only difference is that certain factors $\exp(-a_{uw}/n)$ appear
twice in (36) and only once in (35), namely such factors with $u, w \in \{v_1, \ldots, v_k\}$.
Since there are $\binom{k}{2} = O(1)$ such factors and each is (by our well-behavedness
assumption) $1 + o(1)$, we have
\[ p_v(F) \sim p^0_v(F) \tag{37} \]
uniformly in good sequences $v$. Hence, for simple $F$,
\[ \mathbb{E}n_F(G_n) \sim \frac{1}{\mathrm{aut}(F)} \sum_{v \text{ good}} p^0_v(F). \tag{38} \]
Specializing now to the case of a tree $T$ on $[k]$, recalling (24) we have
\[ t_{\mathrm{isol}}(T, \kappa_{A_n}) = n^{-k} \sum_v \prod_{ij \in E(T)} a_{v_iv_j} \prod_{i=1}^k e^{-\lambda_n(v_i)}, \]
so
\[ \sum_v p^0_v(T) = n\,t_{\mathrm{isol}}(T, \kappa_{A_n}). \]
Our next aim is to show that
\[ \sum_{v \text{ bad}} p^0_v(T) = o(n). \tag{39} \]
Once we have done so, it follows from the formulae above that
\[ \mathbb{E}n_T(G_n) = o(n) + (1 + o(1))\,n\,\frac{t_{\mathrm{isol}}(T, \kappa_{A_n})}{\mathrm{aut}(T)}. \tag{40} \]


In any sequence $v$ contributing to (39), at least one pair $v_i$, $v_j$ coincides.
Since $a_{ii} = 0$ for every $i$, we may assume that if $ij \in E(T)$, then $v_i \ne v_j$. Let us
fix a pattern of coincidences, i.e., decide for which pairs we have $v_i = v_j$.

The contribution to (39) from a given pattern may be bounded by
\[ X(F) := \sum_{w \text{ good}} p^0_w(F), \tag{41} \]
where $F$ is the multi-graph formed from $T$ by identifying the appropriate
vertices, and $w_1, \ldots, w_\ell$ runs over the distinct vertices among $v_1, \ldots, v_k$. Indeed,
the only difference is that in the contribution to (39) we have factors $e^{-d_i\lambda_n(w_i)}$
rather than $e^{-\lambda_n(w_i)}$ in (41), where $d_i \ge 1$ is the number of the $v_j$ that are
mapped to $w_i$.

Note that $F$ is connected. If $F$ is simple, then using (38) again we have
\[ X(F) \sim \mathrm{aut}(F)\,\mathbb{E}n_F(G_n) = O(n), \]
since $n_F(G_n) \le n$. Moreover, if $F$ is simple and not a tree, then by Lemma 2.10
we have $X(F) = o(n)$.

If $F$ is not simple, let $F'$ be the underlying simple graph. Then the terms
of the sums defining $X(F')$ and $X(F)$ are in one-to-one correspondence, and each term
for $F$ is the term for $F'$ multiplied by $e(F) - e(F') \ge 1$ factors of the form
$a_{ij}/n$. Each such factor is $o(1)$, so we have $X(F) = o(X(F'))$. We have just
seen that $X(F') = O(n)$ for any connected simple $F'$, so if $F$ is not simple we
have $X(F) = o(n)$.

Recall that we could write the sum in (39) as a sum over $O(1)$ patterns of
terms each bounded by $X(F)$ for some graph $F$ arising from identifying some
sets of non-adjacent vertices of $T$. Any such graph contains either a cycle or
one or more multiple edges, so $X(F) = o(n)$ in all cases, establishing (39). As
noted above, (40) follows.
Recall that $\delta_\square(\kappa_{A_n}, \kappa) \to 0$. By Theorem 2.3 we thus have $t_{\mathrm{isol}}(T, \kappa_{A_n}) \to
t_{\mathrm{isol}}(T, \kappa) < \infty$, so
\[ \mathbb{E}n_T(G_n) = n\,t_{\mathrm{isol}}(T, \kappa)/\mathrm{aut}(T) + o(n). \tag{42} \]
Let $X_\kappa \cong T$ denote the event that the branching process $X_\kappa$ when viewed as
a tree is isomorphic to $T$ (which implies that it has total size $k$). We claim that
\[ \mathbb{P}(X_\kappa \cong T) = \frac{k}{\mathrm{aut}(T)}\,t_{\mathrm{isol}}(T, \kappa). \tag{43} \]
In fact, the version of (43) for a rooted tree $T$, which is the same except that
the factor $k$ is omitted, is easily proved using induction on $k$ (see [3]), and then
(43) follows easily by summing over the different rootings of $T$.

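Hence, summing over all isomorphism types $T$ of trees on $k$ vertices,

    $\rho_k(\kappa) = \mathbb{P}(|\mathfrak{X}_\kappa| = k) = \sum_T \frac{k\, t_{\mathrm{isol}}(T, \kappa)}{\mathrm{aut}(T)}$,

and from (42), the expected number of vertices of $G_n$ in tree components of order $k$ is

    $\mathbb{E}\Bigl( k \sum_T n_T(G_n) \Bigr) = k \sum_T \frac{n\, t_{\mathrm{isol}}(T, \kappa)}{\mathrm{aut}(T)} + o(n) = \rho_k(\kappa)\, n + o(n)$.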



Since EiV^(G„) = o(n) by Lemma[23ni it follows that ¥.Nk{Gn) = Pfc(K)n+o(n) 
as required, where G„ = Gf^{An)- Since Gpq(A„) and Gpo(A„) have the same 
components, the corresponding statement for Gpo(v4„) follows immediately, so 
we have proved a version of Lemma [221 for the model Gpo(-). As noted earlier, 
bv Lemma 12.91 Lemma [Z!8l follows. □ 



Lemma 2.11. Let $(A_n)$ be a sequence of non-negative symmetric matrices
converging in $\delta_\square$ to a kernel $\kappa$, and let $k \ge 1$ be fixed. Then $N_k(G(A_n))/n \xrightarrow{p} \rho_k(\kappa)$.

Proof. As in [4] or [3], this extension of Lemma 2.8 requires almost no extra
work: simply repeat the proof of Lemma 2.8 but considering pairs of components
of order $k$ to show that with $N = N_k(G(A_n))$ we have $\mathbb{E} N^2/n^2 \to \rho_k(\kappa)^2$. Since
$\mathbb{E} N/n \to \rho_k(\kappa)$ by Lemma 2.8, it follows that $\mathrm{Var}(N/n) = o(1)$, so $N/n$ is
concentrated about its mean. □

As in [4] or [3] we have the following corollary, where $N_{\ge\omega} = \sum_{k \ge \omega} N_k$.

Corollary 2.12. Let $(A_n)$ be a sequence of symmetric $n$-by-$n$ matrices
converging in $\delta_\square$ to a kernel $\kappa$. Then whenever $\omega = \omega(n)$ tends to $\infty$ sufficiently slowly
we have $N_{\ge\omega}(G(A_n))/n \xrightarrow{p} \rho(\kappa)$.

When we have completed the proof of Theorem 1.1, it will follow (arguing
as in the proof of Theorem 1.2 in the reducible case) that Corollary 2.12 in fact
holds for every $\omega(n) \to \infty$ with $\omega(n) = o(n)$.
2.4 Connecting the large components

To complete the proof of Theorem 1.1 we shall use a modified form of the
Erdős-Rényi 'sprinkling' argument to show that almost all vertices in 'large'
components are in fact in a single component. We need a strengthened form
of a lemma implicit in Bollobás, Borgs, Chayes and Riordan [3]. Before stating
this, let us recall another lemma from [3] (again modified, but this time in a
trivial way). By an $(a, b)$-cut in a kernel $\kappa$ we mean a partition $(A, A^c)$ of $[0, 1]$
with $a \le \mu(A) \le 1 - a$ such that $\int_{A \times A^c} \kappa \le b$.

Lemma 2.13. Let $\kappa$ be an irreducible kernel, and let $0 < a < 1/2$ be given. There
is some $b = b(\kappa, a) > 0$ such that $\kappa$ has no $(a, b)$-cut.

Proof. The same statement is proved in [3, Lemma 7], but for graphons, i.e.,
bounded kernels; all kernels considered in [3] were bounded. Although as it
happens we shall only use the bounded case, we may as well note that the
restriction is entirely irrelevant. Indeed, irreducibility of a kernel $\kappa$ depends
only on whether certain integrals are 0, and hence only on the set where $\kappa > 0$.
So if $\kappa$ is irreducible, so is the pointwise minimum $\kappa'$ of $\kappa$ and 1. If $\kappa$ has an
$(a, b)$-cut, then so does $\kappa'$, so the result follows from the bounded case. □

Here then is the key lemma that we shall need.

Lemma 2.14. Let $\kappa$ be an irreducible kernel and $\delta > 0$ a constant. There are
positive constants $\alpha = \alpha(\kappa, \delta)$ and $c = c(\kappa, \delta)$ such that for every sequence $(A_n)$
of non-negative symmetric matrices with $\delta_\square(A_n, \kappa) \to 0$, for all large enough $n$
we have

    $\mathbb{P}(X \sim_{\alpha n} Y) \ge 1 - \exp(-cn)$

for all disjoint $X, Y \subset [n]$ with $|X|, |Y| \ge \delta n$, where $X \sim_k Y$ denotes the event
that the graph $G(A_n)$ contains at least $k$ vertex disjoint paths starting in $X$ and
ending in $Y$.

A version of this lemma, but with the additional condition that the kernel $\kappa$
and entries of the matrices $A_n$ are uniformly bounded, is implicit in [3] (see [4,
Lemma 4.2]). Although the basic strategy of the proof of Lemma 2.14 is the
same as that in [3], dealing with unbounded kernels requires considerable care,
so we shall write out the proof in full.

Proof. We write $(a_{ij})$ for the entries of $A_n$, suppressing the dependence on $n$.
As before, by Lemma 2.1 we may assume that $\max a_{ij} = o(n)$, and in particular
that $a_{ij} \le n/100$, say. We may also assume that $\delta < 1/10$, say.

Throughout this proof we view $A_n$ as a (dense) weighted graph. In particular,
given sets $V$ and $W$ of vertices of $A_n$, i.e., subsets of $[n]$, we write

    $e(V, W) = \sum_{v \in V} \sum_{w \in W} a_{vw}$

for the total edge weight from $V$ to $W$. Similarly, for $v \in [n]$ and $W \subset [n]$,

    $e(v, W) = e(\{v\}, W) = \sum_{w \in W} a_{vw}$.

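Let $\kappa^- = \kappa \wedge 1$ be the pointwise minimum of $\kappa$ and 1. Since $\delta_\square(A_n, \kappa) \to 0$,
there are rearrangements $\kappa_n$ of $\kappa$ such that

    $\|\kappa_{A_n} - \kappa_n\|_\square \to 0$.    (44)

Let $\kappa_n^- = \kappa_n \wedge 1$, noting that $\kappa_n^-$ is a rearrangement of $\kappa^-$.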

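Identifying a subset of $[n]$ with the union of the corresponding intervals of
length $1/n$ in $[0, 1]$, for subsets $V$ and $W$ of $[n]$ we set

    $e_0(V, W) = n^2 \int_{V \times W} \kappa_n$  and  $e_0^-(V, W) = n^2 \int_{V \times W} \kappa_n^-$.

From (44) there is some $\eta(n) \to 0$ such that

    $|e(V, W) - e_0(V, W)| = n^2 \Bigl| \int_{V \times W} (\kappa_{A_n} - \kappa_n) \Bigr| \le n^2 \eta(n)$

for all $V$ and $W$. Since $\kappa_n \ge \kappa_n^-$, so $e_0(V, W) \ge e_0^-(V, W)$, it follows that

    $e(V, W) \ge e_0^-(V, W) - n^2 \eta(n)$.    (45)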

By Lemma 2.13 there is some $b > 0$ such that $\kappa^-$ has no $(\delta/2, b)$-cut. We
may and shall assume that $b < 1/10$, say. Since each $\kappa_n^-$ is a rearrangement of
$\kappa^-$, no $\kappa_n^-$ has a $(\delta/2, b)$-cut.

Fix disjoint sets $X$ and $Y$ of vertices, each of size at least $\delta n$. Arguing as
in [3], we shall inductively define an increasing sequence $S_0, S_1, \ldots$ of sets of
vertices in a way that depends on $A_n$, $X$ and $Y$, but not on the random graph
$G(A_n)$. There will be some additional complications due to unbounded matrix
entries; it turns out we can sidestep these with appropriate use of the inequality
(45).

We start with $S_0 = X$, noting that $|S_0| \ge \delta n$. We shall stop the sequence
when $|S_t|$ first exceeds $(1 - \delta/2)n$. Thus, in defining $S_{t+1}$ from $S_t$, we may
assume that $\delta n \le |S_t| \le (1 - \delta/2)n$. Since $\kappa_n^-$ has no $(\delta/2, b)$-cut, we have

    $e_0^-(S_t^c, S_t) = \sum_{v \notin S_t} e_0^-(v, S_t) \ge b n^2$.

Let

    $T_{t+1} = \{ v \notin S_t : e_0^-(v, S_t) > bn/2 \}$.

Since $\kappa_n^- \le 1$ holds pointwise, $e_0^-(v, S_t) \le |S_t| \le n$ for any $v$. Thus

    $b n^2 \le e_0^-(S_t^c, S_t) \le \frac{bn}{2} \bigl| [n] \setminus (S_t \cup T_{t+1}) \bigr| + n |T_{t+1}| \le \frac{b n^2}{2} + n |T_{t+1}|$.

Hence $|T_{t+1}| \ge bn/2$. Set $S_{t+1} = S_t \cup T_{t+1}$, and continue the construction until
we reach an $S_\ell$ with $|S_\ell| \ge (1 - \delta/2)n$. Note that $\ell \le 2/b = O(1)$.
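
The deterministic construction of the sets $S_t$ can be sketched in code. The following Python is a minimal illustration only, not part of the proof: the function name `grow_sets` and the input `weights` (a symmetric matrix with entries in $[0,1]$, standing in for the cell averages of the bounded kernel) are assumptions introduced here.

```python
from typing import List, Set


def grow_sets(weights: List[List[float]], start: Set[int],
              b: float, delta: float) -> List[Set[int]]:
    """Return [S_0, S_1, ..., S_l] built by the greedy rule described above.

    weights[v][w] is a hypothetical stand-in for the bounded weight between
    vertices v and w; start plays the role of X = S_0.
    """
    n = len(weights)
    sets = [set(start)]
    # Continue until the current set has at least (1 - delta/2) n vertices.
    while len(sets[-1]) < (1 - delta / 2) * n:
        current = sets[-1]
        # T_{t+1}: vertices outside S_t whose weight into S_t exceeds b*n/2.
        new = {
            v for v in range(n)
            if v not in current
            and sum(weights[v][w] for w in current) > b * n / 2
        }
        if not new:
            # Cannot happen under the no-cut condition; stop defensively.
            break
        sets.append(current | new)
    return sets
```

In the proof the no-cut property guarantees that each step adds at least $bn/2$ vertices; the defensive `break` only matters for inputs that violate that assumption.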

We shall now turn to the random graph $G(A_n)$, uncovering the edges between
$T_t$ and $S_{t-1}$, working backwards from $T_\ell$. It will be convenient to set $T_0 = S_0$,
so $S_t = \bigcup_{s=0}^{t} T_s$. Since $|S_\ell| \ge (1 - \delta/2)n$, while $|Y| \ge \delta n$, the set $S_\ell$ contains
at least $\delta n/2$ vertices from $Y$. Since $S_0 = T_0 = X$ is disjoint from $Y$, it follows
that there is some $t_0$, $1 \le t_0 \le \ell$, for which $T_{t_0}$ contains a subset $Y_0$ of $Y$ with

    $|Y_0| \ge \delta n/(2\ell)$.

Next, we aim to construct a set $X_0 \subset S_{t_0-1}$ with $|X_0| \ge b|Y_0|/10$ such that
every $x \in X_0$ is joined to some $y \in Y_0$ by an edge of $G(A_n)$. In fact, we shall
look for a partial matching from $Y_0$ to $S_{t_0-1}$ of size exactly

    $N = b|Y_0|/10$;

we ignore the irrelevant rounding to integers. Let us list the vertices of $Y_0$ as
$\{y_1, \ldots, y_s\}$. We shall test each $y_i$ in turn to see whether it has a neighbour
in $S_{t_0-1}$; the complication is that we must avoid vertices of $S_{t_0-1}$ that are
neighbours of earlier $y_j$. We shall also stop looking for new neighbours if we
already have a large enough matching.

Formally, we inductively define subsets $Z_0, Z_1, \ldots, Z_s$ of $S_{t_0-1}$, starting with
$Z_0 = \emptyset$. For $1 \le i \le s$, if $|Z_{i-1}| = N$ then we set $Z_i = Z_{i-1}$. If $|Z_{i-1}| < N$ and $y_i$
has a neighbour $z \in S_{t_0-1} \setminus Z_{i-1}$, we set $Z_i = Z_{i-1} \cup \{z\}$ for any such neighbour
$z$. If no such neighbour exists, we set $Z_i = Z_{i-1}$. Note that $Z_0 \subset Z_1 \subset \cdots \subset Z_s$
is a random sequence of sets, and $|Z_s| \le N$.
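
The inductive definition of the sets $Z_i$ is a greedy partial matching, which can be sketched as follows. This is an illustration under assumptions, not the authors' code: the name `greedy_partial_matching`, the adjacency dictionary `neighbours` and the cap `target` (playing the role of $N$) are introduced here for the example.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_partial_matching(ys: List[Hashable],
                            neighbours: Dict[Hashable, Iterable[Hashable]],
                            target: int) -> Dict[Hashable, Hashable]:
    """Process y_1, ..., y_s in order, matching each to an unused neighbour.

    The set `used` plays the role of Z_{i-1}; the loop stops looking for
    new neighbours once `target` pairs have been found.
    """
    used: Set[Hashable] = set()
    matching: Dict[Hashable, Hashable] = {}
    for y in ys:
        if len(matching) >= target:
            break  # the matching is already large enough
        for z in neighbours.get(y, ()):
            if z not in used:
                used.add(z)
                matching[y] = z
                break  # one new neighbour per y is enough
    return matching
```

Because each $y_i$ contributes at most one new vertex and used vertices are skipped, the matched partners are distinct, mirroring the requirement that the $y(x)$ be distinct.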

We claim that the following statement holds deterministically: if $n$ is large
enough, then there are at least $s/2$ values of $i$ for which

    $e(y_i, S_{t_0-1} \setminus Z_{i-1}) \ge bn/4$.    (46)

Suppose that this claim does not hold, and let $Y' \subset Y_0$ be a set of at least
$s/2$ vertices $y_i$ for which $e(y_i, S_{t_0-1} \setminus Z_{i-1}) < bn/4$. Since $Z_{i-1} \subset Z_s$, for all
$y \in Y'$ we have $e(y, S_{t_0-1} \setminus Z_s) < bn/4$. Summing over $y$, we have

    $e(Y', S_{t_0-1} \setminus Z_s) < bn|Y'|/4$.

From (45) it follows that

    $e_0^-(Y', S_{t_0-1} \setminus Z_s) < bn|Y'|/4 + n^2 \eta(n)$.

On the other hand, since $Y' \subset T_{t_0}$, we have

    $e_0^-(Y', S_{t_0-1}) \ge bn|Y'|/2$.

Consequently,

    $e_0^-(Y', Z_s) = e_0^-(Y', S_{t_0-1}) - e_0^-(Y', S_{t_0-1} \setminus Z_s) > bn|Y'|/4 - n^2 \eta(n)$.

Since $|Y'| \ge |Y_0|/2 = \Theta(n)$, we see that if $n$ is large enough, then $e_0^-(Y', Z_s) >
bn|Y'|/5$. But $\kappa_n^-$ is bounded by 1, so

    $e_0^-(Y', Z_s) \le |Y'||Z_s| \le |Y'| N = |Y'| (b|Y_0|/10) < bn|Y'|/5$.

This contradiction establishes the claim.

Suppose that for some $i$ we have $e(y_i, S_{t_0-1} \setminus Z_{i-1}) \ge bn/4$. Then the
expected number of edges of $G(A_n)$ from $y_i$ to $S_{t_0-1} \setminus Z_{i-1}$ is at least $b/4$, so
the probability that there is at least one such edge is at least bjh. 

From the claim above, and independence of edges from different vertices $y_i$,
it follows that unless we reach $|Z_i| = N$ at some stage, the number of edges in
the matching we find stochastically dominates a Binomial distribution $D$ with
parameters $|Y_0|/2$ and $b/4$. More precisely, the probability that $|Z_s| < N$ is at
most the probability that $D < N$. But $D$ has mean $|Y_0| b/8 > N = |Y_0| b/10$.
Since $|Y_0| = \Theta(n)$, it follows (by Chernoff's inequality) that with probability

    $1 - \exp(-\Theta(n))$ we have $|Z_s| \ge N$.

In summary, with probability at least $1 - \exp(-\Theta(n))$ we find a set $X_0 = Z_s$
of at least $b|Y_0|/10$ vertices of $S_{t_0-1}$ such that every $x \in X_0$ is joined to some
$y = y(x) \in Y_0$ by an edge of $G(A_n)$, with the $y(x)$ distinct.

Suppose we do find such an $X_0$. As $|X_0| \ge b|Y_0|/10$, there is some $t_1 < t_0$
such that $Y_1 = X_0 \cap T_{t_1}$ contains at least $b|Y_0|/(10\ell)$ vertices. If $t_1 \ge 1$ then,
arguing as above, with probability $1 - \exp(-\Theta(n))$ we find a $t_2$ and a set $Y_2$
of at least $b^2|Y_0|/(10\ell)^2$ vertices of $T_{t_2}$ joined in $G(A_n)$ to $Y_1$, and so on. As
the sequence $t_0, t_1, \ldots$ is decreasing, this process terminates after $s \le \ell$ steps
with $t_s = 0$. Hence, with probability $1 - \exp(-\Theta(n))$ we find a set $Y_s$ of at
least $(b/(10\ell))^{\ell} |Y_0| = \Theta(n)$ vertices of $T_0 = S_0 = X$ joined in $G(A_n)$ by vertex
disjoint paths to vertices in $Y$, completing the proof of Lemma 2.14. □

As in [3], Corollary 2.12 and Lemma 2.14 easily combine to give Theorem 1.1.

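Proof of Theorem 1.1. Let $G_n = G(A_n)$. By Corollary 2.12 there is some $\omega =
\omega(n)$ with $\omega(n) \to \infty$ such that $N_{\ge\omega}(G_n)/n \xrightarrow{p} \rho(\kappa)$. We may and shall assume
that $\omega = o(n)$. Since

    $C_1(G_n) + C_2(G_n) \le \max\{2\omega,\, N_{\ge\omega}(G_n) + \omega\} \le \rho(\kappa) n + o_p(n)$,

it suffices to prove that if $\kappa$ is irreducible then

    $C_1(G_n) \ge \rho(\kappa) n + o_p(n)$.    (47)

If $\rho(\kappa) = 0$, then this statement holds vacuously, so suppose that $\kappa$ is irreducible
and $\rho(\kappa) > 0$.

Fix $0 < \varepsilon < \rho(\kappa)/10$. By [4, Theorem 6.4] we have $\rho((1 - \gamma)\kappa) \nearrow \rho(\kappa)$ as
$\gamma \to 0$. Fix $0 < \gamma < 1$ such that $\rho((1 - \gamma)\kappa) > \rho(\kappa) - \varepsilon$.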






Let $G'_n = G((1 - \gamma)A_n)$ and $G''_n = G(\gamma A_n)$ be independent. We may and
shall assume that $G'_n \cup G''_n \subset G_n$. Applying Corollary 2.12 to the sequence
$(1 - \gamma)A_n$, which tends to $(1 - \gamma)\kappa$ in $\delta_\square$, we see that there is an $\omega = \omega(n)$
tending to infinity such that

    $N_{\ge\omega}(G'_n) \ge (\rho((1 - \gamma)\kappa) - \varepsilon) n \ge (\rho(\kappa) - 2\varepsilon) n$    (48)

holds whp. Let us condition on $G'_n$, assuming that (48) does hold. Let $B$ be the
set of vertices of $G'_n$ in components of size at least $\omega$ (we call these components
large), so $|B| \ge (\rho(\kappa) - 2\varepsilon)n$.

If $C_1(G_n) < (\rho(\kappa) - 3\varepsilon)n$ then there is a partition $(X, Y)$ of $B$ such that
$|X|, |Y| \ge \varepsilon n$, with no path in $G_n$ joining $X$ to $Y$. Let us call such a partition
bad. Since $G'_n \subset G_n$, each of $X$ and $Y$ must be a union of large components
of $G'_n$, so there are at most $2^{n/\omega} = e^{o(n)}$ choices for $(X, Y)$. But the probability
that a given pair $(X, Y)$ is bad is at most the probability that there is no path
in $G''_n \subset G_n$ from $X$ to $Y$; by Lemma 2.14 this probability is $\exp(-\Theta(n))$.
Hence the expected number of bad partitions is $o(1)$, and whp there is no such
partition. Thus $C_1(G_n) \ge (\rho(\kappa) - 3\varepsilon)n$ whp. Letting $\varepsilon \to 0$, the bound (47)
follows, and this is all that is required to complete the proof of Theorem 1.1. □
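
The coupling assumption $G'_n \cup G''_n \subset G_n$ used in the proof above can be checked one edge at a time. As a sketch, write $p = a_{ij}/n \le 1$ for the probability that a given edge $ij$ is present in $G_n$ (the symbol $p$ is introduced here only for this check). Then, since $G'_n$ and $G''_n$ are independent,

```latex
\[
  \mathbb{P}\bigl(ij \in E(G'_n \cup G''_n)\bigr)
  = 1 - \bigl(1 - (1-\gamma)p\bigr)\bigl(1 - \gamma p\bigr)
  = p - \gamma(1-\gamma)p^2
  \le p .
\]
```

Because edges are independent in all three graphs, this edgewise inequality suffices to realize $G'_n \cup G''_n$ as a subgraph of $G_n$ on a common probability space.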

2.5 The reducible case: proof of Theorem 1.2

In this subsection we shall justify the terminology by showing that one can 
reduce the reducible case to the irreducible case. Surprisingly, in this setting 
(unlike that of [5]), this is not quite immediate. 

The key step is a lemma allowing us to partition a sequence of matrices
converging to a reducible kernel. By the restriction $\kappa_S$ of a kernel $\kappa$ to a set
$S \subset [0, 1]$ we simply mean the function obtained by restricting $\kappa$ to $S \times S$, which
we may think of as a kernel on a measure space that is no longer a probability
space. It will often be convenient to consider the rescaled restriction $\kappa'_S$: when $S$
is an interval (which we can always assume) this is the kernel on $[0, 1]^2$ obtained
by linearly 'stretching' $\kappa_S$ in the obvious way.

Lemma 2.15. Let $\kappa$ be a reducible kernel and $(S_1, S_2)$ a partition of $[0, 1]$
with $0 < \mu(S_1), \mu(S_2) < 1$ such that $\kappa_{S_1}$ is irreducible and $\kappa$ is zero a.e. on
$S_1 \times S_2$. If $(A_n)$ is a sequence of non-negative symmetric matrices such that
$\delta_\square(A_n, \kappa) \to 0$, then we may find for each $n$ complementary subsets $V_{n,1}$ and
$V_{n,2}$ of $[n]$ such that $|V_{n,i}| \sim \mu(S_i) n$ and $\delta_\square(A_{n,i}, \kappa'_i) \to 0$, where $\kappa'_i = \kappa'_{S_i}$ is the
rescaled restriction of $\kappa$ to $S_i$ and $A_{n,i}$ is the principal minor of $A_n$ obtained
by selecting the rows and columns indexed by $V_{n,i}$. Moreover, the sum of the
entries of $A_n$ corresponding to $(i, j) \in V_{n,1} \times V_{n,2}$ is $o(n^2)$.

In other words, we may split the vertex set of the random graph $G(A_n)$ into
$V_{n,1}$ and $V_{n,2}$ so that the corresponding random graphs have edge probability
matrices converging to the restrictions of $\kappa$ to $S_1$ and $S_2$ respectively (after
suitable rescaling).
Proof. Suppose that $\delta_\square(A_n, \kappa) \to 0$. Let $(\tau_n)$ be a sequence of measure-preserving
bijections from $[0, 1]$ to itself, corresponding to rearrangements of the kernels
$\kappa_{A_n}$. Let $I_{n,i} = ((i - 1)/n, i/n]$ denote the subinterval of $[0, 1]$ corresponding
to vertex $i$, i.e., to the $i$th row/column of $A_n$. Then, in the rearrangement,
$I_{n,i} \cap \tau_n(S_j)$ is the portion of $I_{n,i}$ that is rearranged to correspond to part of
$S_j$. We write

    $s_{n,i} = \min_{j = 1, 2} \mu(I_{n,i} \cap \tau_n(S_j))$

for the extent that $I_{n,i}$ is split between $S_1$ and $S_2$, noting that $0 \le s_{n,i} \le 1/(2n)$.
We call the sequence $(\tau_n)$ good if

    $\|\kappa_{A_n}^{(\tau_n)} - \kappa\|_\square \to 0$    (49)

and

    $s_n = \sum_{i=1}^{n} s_{n,i} = o(1)$.



Such a good sequence corresponds to rearranging $A_n$ to be close to $\kappa$ in the
cut norm, while mapping almost every vertex either almost entirely into $S_1$
or almost entirely into $S_2$. It is not too hard to check that if such a sequence
exists, then the first conclusion of the lemma follows; we omit the tedious details,
noting only that since $\kappa$ is integrable, for any subsets $X_n$ of $[0, 1]^2$ with measure
tending to 0 we have $\int_{X_n} \kappa \to 0$. This shows that changing our rearrangement
on a set of measure $o(1)$ will not affect cut norm convergence. To see that the
final statement follows, let $U_{n,j}$ be the subset of $[0, 1]$ corresponding to $V_{n,j}$.
Then

    $\int_{U_{n,1} \times U_{n,2}} \kappa_{A_n} \le \|\kappa_{A_n}^{(\tau_n)} - \kappa\|_\square + \int_{\tau_n^{-1}(U_{n,1}) \times \tau_n^{-1}(U_{n,2})} \kappa = o(1)$,

since $\tau_n^{-1}(U_{n,j})$ differs from $S_j$ in a set of measure $o(1)$.

It remains to prove that a good sequence exists. By hypothesis, there is a
sequence $(\tau_n)$ such that (49) holds; as we shall see, any such sequence must be
good! Indeed, suppose $s_n$ does not tend to zero. Then passing to a subsequence,
we may assume that $s_n \ge \delta$ for every $n$, for some $\delta > 0$.

For every $n$ in our (sub)sequence, and each $i \in [n]$, pick subsets $E_{i,1}, E_{i,2}$ of
$I_{n,i}$ of measure $s_{n,i}$ with $E_{i,j} \subset \tau_n(S_j)$; this is possible by the definition of $s_{n,i}$.
Finally, for $j = 1, 2$, let $E_j = \bigcup_{i=1}^{n} E_{i,j}$, noting that $E_j$ depends on $n$, and that

    $\mu(E_j) = s_n \ge \delta$.

Since $\tau_n^{-1}(E_2) \subset S_2$, we have $\int_{\tau_n^{-1}(E_2) \times S_1} \kappa = 0$. From (49) and the definition
of the cut norm it follows that $\int_{E_2 \times \tau_n(S_1)} \kappa_{A_n} = o(1)$. But

    $\int_{E_1 \times \tau_n(S_1)} \kappa_{A_n} = \int_{E_2 \times \tau_n(S_1)} \kappa_{A_n}$,

since $\kappa_{A_n}(x, y)$ depends on $x$ only through which interval $I_{n,i}$ the point $x$ lies
in, and $E_1$ and $E_2$ intersect each $I_{n,i}$ in sets of the same measure. Hence
$\int_{E_1 \times \tau_n(S_1)} \kappa_{A_n} = o(1)$. Using (49) again, $\int_{\tau_n^{-1}(E_1) \times S_1} \kappa = o(1)$.

But $\kappa_{S_1}$ is irreducible, so for a.e. $x$ in $S_1$ we have $f(x) = \int_{S_1} \kappa(x, y)\, dy > 0$.
It follows that there is some $\gamma > 0$ such that the integral of $f$ over any subset
of $S_1$ of measure at least $\delta$ is at least $\gamma$. But $\int_{\tau_n^{-1}(E_1) \times S_1} \kappa$ is exactly such an integral, since
$\tau_n^{-1}(E_1) \subset S_1$, giving a contradiction. This contradiction shows that $(\tau_n)$ is
indeed good, completing the proof. □

Using Lemma 2.15 it is not hard to deduce Theorem 1.2 from Theorem 1.1.

Proof of Theorem 1.2. Multiplying the kernel $\kappa$ by $c$, we may and shall assume
that $c = 1$.

Part (a) of Theorem 1.2 follows from the first statement of Theorem 1.1;
part (c) is a restatement of the second statement of Theorem 1.1, so it remains
only to prove part (b).

As shown in [4, Lemma 5.17], we may decompose $\kappa$ into irreducible kernels.
More precisely, there is a partition $(S_i)_{i=0}^{N}$ of $[0, 1]$ with $0 \le N \le \infty$ such that
each $S_i$ has positive measure, the restriction of $\kappa$ to $S_i \times S_i$ is irreducible for
each $i \ge 1$, and $\kappa$ is zero a.e. off $\bigcup_{i=1}^{N} S_i \times S_i$.

By assumption, $\delta_\square(A_n, \kappa) \to 0$. Applying Lemma 2.15 repeatedly, for any
finite $N' \le N$ we may split the vertex set $[n]$ of the graph $G_n$ into $N' + 1$
subsets $V_{n,i}$, $i = 0, 1, \ldots, N'$, such that, for each $i \ne 0$, $|V_{n,i}| \sim \mu(S_i) n$ and
$\delta_\square(A'_{n,i}, \kappa'_i) \to 0$, where $A'_{n,i}$ is the submatrix of $A_n$ corresponding to $V_{n,i}$, and
$\kappa'_i = \kappa'_{S_i}$ is the rescaled restriction of $\kappa$ to $S_i$. Let $G_{n,i}$ be the subgraph of $G_n$
induced by $V_{n,i}$.

In what follows it is convenient to add zero rows and columns to $A'_{n,i}$ to
obtain an $n$-by-$n$ matrix $A_{n,i}$, and to consider the kernel $\kappa_i$ on $[0, 1]^2$ agreeing
with $\kappa$ on $S_i^2$ and equal to zero off this set. It is easy to check that $\delta_\square(A'_{n,i}, \kappa'_i) \to 0$
implies $\delta_\square(A_{n,i}, \kappa_i) \to 0$. Although $\kappa_i$ is formally reducible, it is so only in a
trivial sense (called quasi-irreducible in [4]), and by rescaling suitably it is easy
to check that Theorem 1.1 applies to such kernels (with, as it happens, no extra
factors from the rescaling), so by Theorem 1.1 we have $C_1(G_{n,i})/n \xrightarrow{p} \rho(\kappa_i)$ for
each $i \ge 1$.

By assumption, $\|T_\kappa\| > 1$. But

    $\|T_\kappa\| = \sup_i \|T_{\kappa_i}\|$,    (50)

so there is some $i$ with $\|T_{\kappa_i}\| > 1$. We choose $N' \ge i$. Since $C_1(G_n) \ge C_1(G_{n,i})$,
it follows that $C_1(G_n) = \Theta(n)$ whp as claimed. Finally, suppose that $\kappa$ is
bounded, by $M$, say. Since $\|T_{\kappa_i}\| \le M \mu(S_i)$, only finitely many of the $T_{\kappa_i}$ can
have norm exceeding any given positive constant, and the supremum in (50) is attained, say
at $i = j$. As noted in [3], the bound $\rho(\kappa) \ge (\|T_\kappa\| - 1)/\sup \kappa$ is implicit in [3].
Applying this to $\kappa_j$, the final part of Theorem 1.2(b) follows. □

Note that we cannot say what the limiting size of the giant component is in
the reducible case: we know that there are $o_p(n)$ edges joining different $G_{n,i}$,
but have no further control on these edges (which may be completely absent),
so we do not know whether they link the largest components in the different
$G_{n,i}$ or not. Thus $C_1(G_n)/n$ may be as small as $\max_i \rho(\kappa_i) + o_p(1)$, or as large
as $\rho(\kappa) + o_p(1) = \sum_i \rho(\kappa_i) + o_p(1)$.

Let us close this subsection with a conjecture. By a rearrangement Bn of 
a matrix An we simply mean a matrix obtained from An by applying some 
permutation to the columns, and the same permutation to the rows. 

Conjecture 1. Let $\kappa$ be a kernel, and $(A_n)$ a sequence of non-negative
symmetric matrices in which $A_n$ is $n$-by-$n$, such that $\delta_\square(A_n, \kappa) \to 0$. Then there
exist rearrangements $B_n$ of each $A_n$ such that $\|\kappa_{B_n} - \kappa\|_\square \to 0$.

A proof of this conjecture would give a simpler reduction of the reducible
case to the irreducible one. We can prove versions of this conjecture with various
additional assumptions. Suppose first that $\kappa$ is of finite type. Then the proof
of Lemma 2.15 adapts easily to give the desired rearrangements: first show that
in rearrangements (almost) realizing the cut distance, there is no significant
splitting of vertices between the parts of $\kappa$ (unless two parts of $\kappa$ are 'equivalent',
but then they may be united into a single part). This leads eventually to a
rearrangement mapping almost every vertex to some subset of some part of $\kappa$;
since $\kappa$ is constant on its parts, the subset is irrelevant and may be taken to be
an interval, leading to the required $B_n$.

On the other hand, suppose that both $\kappa$ and the entries of all $A_n$ are
uniformly bounded, without loss of generality by 1. Then approximating $\kappa$ by some
$n$-by-$n$ kernel, and using a result of Borgs, Chayes, Lovász, Sós and Vesztergombi
[5] that if two $n$-by-$n$ kernels bounded by 1 are within distance $\delta$ in the
cut metric, then there are rearrangements of the corresponding matrices that
are within $32\delta^{1/67}$ in the cut norm, one can find $B_n$ with $\|\kappa_{B_n} - \kappa\|_\square \to 0$.

2.6 Stability 

In this subsection we shall prove our stability result, Theorem 1.3, and deduce
Theorem 1.4. As in [3], we adapt an argument of Luczak and McDiarmid [21]
showing that for $c > 1$ constant, whp the giant component of $G(n, c/n)$ has
the property that if its vertex set is divided into two pieces that are not too
small, then there are many edges from one piece to the other. We shall need
the following deterministic lemma from [21].

Lemma 2.16. For any $\varepsilon > 0$, there exist $\eta_0 = \eta_0(\varepsilon) > 0$ and $n_0$ such that the
following holds. For all $n \ge n_0$, and for all connected graphs $G$ with $n$ vertices,
there are at most $(1 + \varepsilon)^n$ bipartitions of $G$ with at most $\eta_0 n$ cross edges. □

Using this and Lemma 2.14 we shall prove the following lemma, which
corresponds roughly to the edge deletion case of Theorem 1.3.

Lemma 2.17. Let $\kappa$ be an irreducible kernel and $(A_n)$ a sequence of
non-negative symmetric matrices such that $\delta_\square(A_n, \kappa) \to 0$. For every $\varepsilon > 0$ there is
a $\delta = \delta(\kappa, \varepsilon) > 0$ such that, whp,

    $C_1(G'_n) \ge (\rho(\kappa) - \varepsilon) n$

for every graph $G'_n$ that may be obtained from $G(A_n)$ by deleting at most $\delta n$
edges.

Proof. We may assume that $\rho(\kappa) > 0$, as otherwise there is nothing to prove.
Reducing $\varepsilon$ if necessary, we may and shall assume that $\varepsilon < \rho(\kappa)/10$.

Let $\mathcal{B}_\delta$ be the 'bad' event that it is possible to delete at most $\delta n$ edges
from $G_n = G(A_n)$ so that in what remains no component contains more than
$(\rho(\kappa) - \varepsilon)n$ vertices; our aim is to show that for some constant $\delta > 0$ we have

    $\mathbb{P}(\mathcal{B}_\delta) \to 0$.

Suppressing the dependence on $n$, given $0 < \gamma < 1$, let $G_1 = G((1 - \gamma)A_n)$
and $G_2 = G(\gamma A_n)$. As before, taking $G_1$ and $G_2$ independent we may assume
that $G_1 \cup G_2 \subset G_n = G(A_n)$. As noted earlier, by [4, Theorem 6.4] we have
$\rho((1 - \gamma)\kappa) \nearrow \rho(\kappa)$ as $\gamma \to 0$. Fix $0 < \gamma < 1$ such that $\rho((1 - \gamma)\kappa) > \rho(\kappa) - \varepsilon/2$.

As in [21], let $U_1$ denote the largest component of $G_1$, chosen according to any
rule if there is a tie, and consider the event

    $\mathcal{A}_1 := \{ |U_1| \ge (\rho(\kappa) - \varepsilon/2) n \}$.

Since $\rho((1 - \gamma)\kappa) > \rho(\kappa) - \varepsilon/2$, applying Theorem 1.1 to $G_1$ we see that $\mathcal{A}_1$
holds whp.

By Lemma 2.14 applied with $\gamma\kappa$ in place of $\kappa$, there exist constants $\alpha > 0$
and $c > 0$ such that, given two disjoint sets $X$, $Y$ of vertices of $G_2$ with
$|X|, |Y| \ge \varepsilon n/2$, we have

    $\mathbb{P}(X \sim_{\alpha n} Y) \ge 1 - e^{-cn}$    (51)

for all large enough $n$, where $X \sim_k Y$ is the event that there are at least $k$
vertex disjoint paths from $X$ to $Y$ in $G_2$. Let $\eta = \eta_0(c/2)$, where $\eta_0(\cdot)$ is the
function appearing in Lemma 2.16, and set

    $\delta = \min\{ (\rho(\kappa) - \varepsilon/2)\eta,\ \alpha/2 \}$.

Suppose that $\mathcal{B} = \mathcal{B}_\delta$ and $\mathcal{A}_1$ both hold. Then there is a set $E$ of at most $\delta n$
edges of $G_n$ such that in $G'_n = G_n - E$ there is no component with more than
$(\rho(\kappa) - \varepsilon)n \le |U_1| - \varepsilon n/2$ vertices. In particular, there is a bipartition $(X, Y)$
of $U_1$ with $|X|, |Y| \ge \varepsilon n/2$ such that there is no path in $G'_n$ from $X$ to $Y$. But
then two conditions must hold: (i) in $G_1$ there are at most $\delta n \le \eta|U_1|$ edges
from $X$ to $Y$, and (ii) it is possible to separate $X$ from $Y$ in $G_2$ by deleting at
most $\delta n < \alpha n$ edges.

Let us condition on $G_1$, assuming that $\mathcal{A}_1$ holds. Then by Lemma 2.16, if $n$
is large enough, there are at most $(1 + c/2)^{|U_1|} \le (1 + c/2)^n \le e^{cn/2}$ bipartitions
$(X, Y)$ of $U_1$ with $|X|, |Y| \ge \varepsilon n/2$ satisfying property (i). By (51), for each of
these bipartitions the probability that it has property (ii) is at most $e^{-cn}$. It
follows that $\mathbb{P}(\mathcal{B} \cap \mathcal{A}_1) \le e^{cn/2} e^{-cn} = o(1)$. Since $\mathcal{A}_1$ holds whp, we thus have
$\mathbb{P}(\mathcal{B}) = o(1)$, as required. □
To handle the deletion of vertices rather than edges we simply show that
whp all small sets of vertices meet few edges.

Lemma 2.18. Let $\kappa$ be a kernel and $\delta > 0$ a real number. Then there is a
$\gamma > 0$ such that, if $(A_n)$ is a sequence of non-negative symmetric matrices with
$\delta_\square(A_n, \kappa) \to 0$, then whp every set of at most $\gamma n$ vertices of $G(A_n)$ meets at
most $\delta n$ edges.

Proof. For $0 < \gamma < 1$ let $f(\gamma) = \sup \int_{A \times [0,1]} \kappa(x, y)\, d\mu(x)\, d\mu(y)$, where the
supremum is over all subsets $A$ of $[0, 1]$ with $\mu(A) \le \gamma$. Since $\kappa$ is integrable,
we have $f(\gamma) \to 0$ as $\gamma \to 0$, and there is some $\gamma_0$ with $f(\gamma_0) < \delta/4$. Let us fix
$\gamma \le \gamma_0$ chosen small enough that $(e/\gamma)^{\gamma} < e^{\delta/20}$, say.

Given a set $U$ of vertices of $G_n = G(A_n)$, let $\nu(U)$ denote the expectation of
the sum of the degrees of the vertices in $U$. If $|U| \le \gamma n$, then from the definition
of the cut metric we have

    $\nu(U)/n \le f(\gamma) + \delta_\square(A_n, \kappa)$,

so for $n$ large enough we have $\nu(U) < \delta n/2$ for all such $U$. The number of edges
incident with $U$ has expectation at most $\nu(U)$, and is a sum of independent
indicator variables. It follows from the Chernoff bounds that the probability
that a given $U$ meets at least $\delta n$ edges is at most $e^{-\delta n/10}$, say. Since there
are at most $\binom{n}{\gamma n} \le (e/\gamma)^{\gamma n} \le e^{\delta n/20}$ choices for $U$ with $|U| = \lfloor \gamma n \rfloor$, the result
follows. □
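
The counting step $\binom{n}{\gamma n} \le (e/\gamma)^{\gamma n}$ in the union bound can be sanity-checked numerically. The sketch below is illustrative only; the helper names `log_binomial`, `log_counting_bound` and `bound_holds` are introduced here and are not from the paper. It works with logarithms to avoid overflow.

```python
import math


def log_binomial(n: int, m: int) -> float:
    """Natural logarithm of the binomial coefficient C(n, m)."""
    return math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)


def log_counting_bound(n: int, gamma: float) -> float:
    """Natural logarithm of (e / gamma) ** (gamma * n)."""
    return gamma * n * math.log(math.e / gamma)


def bound_holds(n: int, gamma: float) -> bool:
    """Check C(n, floor(gamma * n)) <= (e / gamma) ** (gamma * n)."""
    m = math.floor(gamma * n)
    return log_binomial(n, m) <= log_counting_bound(n, gamma)
```

A numerical check of this kind does not replace the standard estimate $\binom{n}{m} \le (en/m)^m$; it only confirms the inequality for the particular values tried.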

We are now ready to prove Theorem 1.3.

Proof of Theorem 1.3. Recall that $G'_n$ will be obtained from $G_n = G(A_n)$ by
deleting at most $\delta n$ vertices, and then adding and deleting at most $\delta n$ edges.
Considering when $C_1(G'_n)$ is maximized or minimized, it clearly suffices to prove
that if $\delta$ is chosen small enough, then whp $C_1(G'_n) \ge (\rho(\kappa) - \varepsilon)n$ for all such
$G'_n$ obtained by deletion only, and that whp $C_1(G'_n) \le (\rho(\kappa) + \varepsilon)n$ for such $G'_n$
obtained by adding edges to $G_n$.

The first statement is immediate from Lemmas 2.17 and 2.18, as in [4]; we
omit the simple details.

The second statement follows easily from Lemma 2.11; the argument is identical
to that in [4]. Simply choose $k$ such that $\sum_{k' \le k} \rho_{k'}(\kappa) \ge 1 - \rho(\kappa) - \varepsilon/3$; then
by Lemma 2.11 there are whp at least $(1 - \rho(\kappa) - \varepsilon/2)n$ vertices of $G_n$ in
components of size at most $k$. Set $\delta = \varepsilon/(4k)$, and note that adding at most
$\delta n$ edges changes the number of vertices in components of size at most $k$ by at
most $2k\delta n = \varepsilon n/2$. □

We now turn to the proof of Theorem 1.4, giving exponential tail bounds on
the size of $C_1(G_n)$.

Proof of Theorem 1.4. In proving the lower bound on $C_1(G_n)$, we may assume
that $\varepsilon < \rho(\kappa)$, and in particular that $\rho(\kappa) > 0$. Given a graph $G$, let $D = D(G)$
be the minimal $d$ such that it is possible to delete $d$ vertices from $G$ to obtain
a graph $G''$ with $C_1(G'') < (\rho(\kappa) - \varepsilon)n$. Note that if $G_1$ and $G_2$ differ only
in the set of edges incident with some vertex $v$, then $|D(G_1) - D(G_2)| \le 1$.
Theorem 1.3 implies that for some $\delta > 0$ we have $\mathbb{E} D(G_n) \ge \delta n$ for all large
enough $n$. Constructing $G_n$ by making $n$ independent choices, where the $i$th
choice is the set of edges $ji$, $j < i$, it follows from McDiarmid's inequality [35]
that

    $\mathbb{P}(C_1(G_n) < (\rho(\kappa) - \varepsilon)n) = \mathbb{P}(D(G_n) = 0) \le e^{-2(\delta n)^2/n} = e^{-2\delta^2 n}$.    (52)

(Of course, one can instead use the Hoeffding-Azuma inequality, in which case
the factor two in the exponent is in the denominator.)

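Turning to the upper bounds on $C_1(G_n)$ and $C_2(G_n)$, fix $k \ge 1$ with
$\rho_{\le k}(\kappa) = \sum_{k' \le k} \rho_{k'}(\kappa) > 1 - \rho(\kappa) - \varepsilon/4$, and consider $N_n = N_{\le k}(G_n)$. We
have $\mathbb{E} N_n/n \to \rho_{\le k}(\kappa)$ by Lemma 2.8, so for $n$ large enough we have $\mathbb{E} N_n \ge
(1 - \rho(\kappa) - \varepsilon/3)n$. We shall show that

    $\mathbb{P}(|N_n - \mathbb{E} N_n| \ge \varepsilon n/2) \le e^{-\gamma n}$    (53)

for some $\gamma > 0$; then, for $n$ large enough,

    $\mathbb{P}(C_1(G_n) + C_2(G_n) \ge (\rho(\kappa) + \varepsilon)n) \le \mathbb{P}(N_{>k}(G_n) + 2k \ge (\rho(\kappa) + \varepsilon)n)$

        $\le \mathbb{P}(N_n \le \mathbb{E} N_n - \varepsilon n/2) \le e^{-\gamma n}$.

Together with (52) this gives the required bounds on $C_1(G_n)$. For the bound
on $C_2(G_n)$, we use (52) to bound $C_1(G_n)$ from below, and replace $\varepsilon$ by $\varepsilon/2$.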

In our proof of (53) the key point is that $N_{\le k}(G)$ is edge-Lipschitz: if $G$ and
$G'$ differ in one edge, then $|N_{\le k}(G) - N_{\le k}(G')| \le 2k$. To prove concentration,
we apply Talagrand's inequality [24] in the form of [18, Theorem 2.29]. With
$N = \binom{n}{2}$, the independent variables $Z_1, \ldots, Z_N$ are the indicator functions of
the events that the individual edges are present. Let $f(G_n) = f(Z_1, \ldots, Z_N) =
n - N_n = N_{>k}(G_n)$. Then changing one $Z_i$ changes $N_n$, and hence $f$, by at
most $c_i = 2k$. Whenever $f(G_n) \ge r$, then taking (the edge set of) one spanning
tree for each component of size greater than $k$, there is a certificate of size at
most $n$ for the event that $f(G_n) \ge r$. Hence we may take $\psi(r) = (2k)^2 n$ for all
$r$, and Talagrand's inequality gives

    $\mathbb{P}(|f(G_n) - m| \ge t) \le 4 e^{-t^2/(16 k^2 n)}$,

where m is the median value of f{Gn)- As usual (see, e.g., [E]), it then follows 
that the mean and median are close (within $O(\sqrt{n})$), and recalling that $N_n =
n - f(G_n)$, for $n$ large enough we obtain (53) with $\gamma = \varepsilon^2/(70k^2)$, say. □
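
The arithmetic behind the final constant can be spelled out as follows. This is a sketch, assuming only that the mean and median of $f(G_n)$ differ by at most $C\sqrt{n}$ for some constant $C$ (the symbol $C$ is introduced here for illustration), and taking $t = \varepsilon n/2 - C\sqrt{n}$ in Talagrand's inequality:

```latex
\[
  \mathbb{P}\bigl(|N_n - \mathbb{E} N_n| \ge \varepsilon n/2\bigr)
  \le \mathbb{P}\bigl(|f(G_n) - m| \ge \varepsilon n/2 - C\sqrt{n}\bigr)
  \le 4\exp\!\left(-\frac{(\varepsilon n/2 - C\sqrt{n})^2}{16k^2 n}\right)
  = 4\exp\!\left(-\frac{\varepsilon^2 n}{64k^2}\,(1 - o(1))\right)
  \le \exp\!\left(-\frac{\varepsilon^2 n}{70k^2}\right)
\]
```

for all large enough $n$, since $1/64 > 1/70$ leaves room to absorb both the factor 4 and the $1 - o(1)$ correction.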

3 Extension to hypergraphs 

In this section we shall prove an extension of Theorems 1.1 and 1.2 to
hypergraphs. Alternatively, this may be thought of as an extension of the random
graph model with clustering introduced in [5]. Most of our arguments are simple
modifications of those in previous sections, so we shall only outline them. There
are one or two places where adapting the proof is not so easy, and there we shall
give more detail.

Let $(S, \mu)$ be a probability space. We write $\mathcal{W}_r$ for the set of all integrable
non-negative functions $W : S^r \to [0, \infty)$, and $\mathcal{W}_{r,\mathrm{sym}}$ for the subset of such
functions that are symmetric under permutations of the coordinates. Often
we shall call a function $W \in \mathcal{W}_{r,\mathrm{sym}}$ an $r$-kernel. A hyperkernel $\kappa$ is simply a
sequence $(\kappa_r)_{r \ge 2}$, where $\kappa_r$ is an $r$-kernel. The integral $i(\kappa)$ of a hyperkernel is
defined to be

    $i(\kappa) = \sum_{r \ge 2} r \int_{S^r} \kappa_r$,

and a hyperkernel $\kappa$ is integrable if $i(\kappa) < \infty$.

The cut norm has a natural extension to $r$-kernels or indeed to $L^1(S^r) \supset \mathcal{W}_r$.
As before, we consider two slightly different definitions: for $W \in L^1(S^r)$ set

    $\|W\|_{\square,1} := \sup_{S_1, \ldots, S_r} \Bigl| \int_{S_1 \times \cdots \times S_r} W(x_1, \ldots, x_r) \Bigr|$,    (54)

where the supremum is over all $r$-tuples of measurable subsets of $S$.
Alternatively, we may consider

    $\|W\|_{\square,2} := \sup_{\|f_1\|_\infty, \ldots, \|f_r\|_\infty \le 1} \Bigl| \int_{S^r} f_1(x_1) \cdots f_r(x_r) W(x_1, \ldots, x_r) \Bigr|$.    (55)

Much of the time it makes no difference which version of $\|\cdot\|_\square$ we consider: as
before, in the supremum in (55) we may assume that each $f_i$ is a $\pm 1$ function,
and we see that

    $\|W\|_{\square,1} \le \|W\|_{\square,2} \le 2^r \|W\|_{\square,1}$.

While (55) is the more natural definition from the point of view of functional
analysis, we shall in fact take (54) as the definition for most of this section,
writing $\|W\|_\square$ for $\|W\|_{\square,1}$; it turns out that we obtain a very slightly stronger
result this way.
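
For $r = 2$ and a step function given by a small matrix, both versions of the cut norm can be computed by brute force, which makes the comparison between them concrete. The following Python is a sketch under assumptions: the function names are introduced here, the matrix `m` stands for a step function on $[0,1]^2$ that is constant on cells of side $1/n$, and for such functions the suprema are attained on unions of cells and on $\pm 1$ step functions respectively. The cost is exponential in $n$, so this is only meant for tiny examples.

```python
from itertools import product
from typing import List, Sequence


def _bilinear(m: List[List[float]], x: Sequence[int], y: Sequence[int]) -> float:
    """Sum of m[i][j] * x[i] * y[j] over all index pairs."""
    n = len(m)
    return sum(m[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def cut_norm_sets(m: List[List[float]]) -> float:
    """Version (54) for r = 2: maximise over pairs of index subsets."""
    n = len(m)
    best = max(abs(_bilinear(m, s, t))
               for s in product((0, 1), repeat=n)
               for t in product((0, 1), repeat=n))
    return best / n ** 2  # each cell has measure 1/n^2


def cut_norm_signs(m: List[List[float]]) -> float:
    """Version (55) for r = 2, restricted to functions with values +1 or -1."""
    n = len(m)
    best = max(abs(_bilinear(m, x, y))
               for x in product((-1, 1), repeat=n)
               for y in product((-1, 1), repeat=n))
    return best / n ** 2
```

For the matrix `[[1, -1], [-1, 1]]` the first version gives 0.25 and the second gives 1.0, so the factor $2^r = 4$ in the comparison is attained in this example.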

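Given a family $W = (W_r)_{r \ge 2}$ with $W_r \in \mathcal{W}_r$, set

    $i(W) = \sum_{r \ge 2} r \int_{S^r} W_r$

and

    $\|W\|_\square = \sum_{r \ge 2} r \|W_r\|_\square$,    (56)

where $\|\cdot\|_\square = \|\cdot\|_{\square,1}$. The reason for the factors of $r$ above will become clear
shortly.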
Note that while considering a single value of $r$, it is irrelevant whether we
use $\|\cdot\|_{\square,2}$ or $\|\cdot\|_{\square,1}$. However, as soon as we sum cut norms for different $r$, the
potential factor of up to $2^r$ may make a difference. All our results will apply
using $\|\cdot\|_{\square,2}$ instead of $\|\cdot\|_{\square,1}$, but they would then be slightly weaker, as fewer
sequences of hyperkernels converge in the resulting norm.

Note that for $W \in L^1(S^r)$ we trivially have

    $\Bigl| \int_{S^r} W \Bigr| \le \|W\|_\square \le \|W\|_{L^1}$,

so

    $|i(W)| \le \|W\|_\square \le \|W\|_{L^1}$.

As in [5], the quantity $i(W)$ will play a key role in various approximation
arguments; the inequality $|i(W)| \le \|W\|_\square$ is key to making these arguments work
here.

Given a hyperkernel $\kappa$ and a measure-preserving bijection $\tau : S \to S$, let
$\kappa^{(\tau)} = (\kappa_r^{(\tau)})_{r \ge 2}$ be the hyperkernel defined by

    $\kappa_r^{(\tau)}(x_1, \ldots, x_r) = \kappa_r(\tau(x_1), \ldots, \tau(x_r))$.

We call $\kappa^{(\tau)}$ a rearrangement of $\kappa$, and write $\kappa' \sim \kappa$ if $\kappa'$ is a rearrangement
of $\kappa$. The cut metric extends to hyperkernels on $[0, 1]$ as follows:

    $\delta_\square(\kappa, \kappa') = \inf_{\kappa'' \sim \kappa'} \|\kappa - \kappa''\|_\square$.

For hyperkernels on general probability spaces, which need not be the same, we
use couplings to define $\delta_\square$.

Turning to graphs, our next aim is to define an extension of the random
graph $G(A_n)$.

By an $n$-by-$n$ hypermatrix $H_n$ we mean a sequence $(H_{n,r})_{r\ge 2}$ where
each $H_{n,r}$ is an $r$-dimensional array with entries
$h_{i_1 i_2 \cdots i_r} \ge 0$, $1 \le i_1,\dots,i_r \le n$, that is symmetric
under all permutations of the coordinates. There is a hyperkernel
$\underline{\kappa} = \underline{\kappa}(H_n) = (\kappa_r)_{r\ge 2}$ naturally
associated to a hypermatrix $H_n$: each $\kappa_r$ is a piecewise constant
function on $[0,1]^r$ whose value on a certain hypercube of side $1/n$ is given
by the appropriate entry of $H_{n,r}$.

Turning to the random hypergraph, as in [5], the natural normalization in
the hypergraph case is unfortunately not the same as in the graph case. Roughly
speaking, for each entry $h_{i_1 i_2 \cdots i_r}$ of each $H_{n,r}$ we shall add a
hyperedge on the corresponding vertices to our hypergraph with probability
$h_{i_1 i_2 \cdots i_r}/n^{r-1}$. Unfortunately this means that the probability
that a particular $r$-vertex hyperedge is present is then (roughly)
$r!\, h_{i_1 i_2 \cdots i_r}/n^{r-1}$, and in particular $2h_{ij}/n$ in the graph
case.

Formally, given a hypermatrix $H_n$, let $\mathcal{H}(H_n)$ be the random
hypergraph on $[n]$ in which edges are present independently, and for any
$2 \le r \le n$ and $i_1 < i_2 < \cdots < i_r$, the probability that the
hyperedge $i_1 i_2 \cdots i_r$ is present is

\[ \min\{ r!\, h_{i_1 i_2 \cdots i_r}/n^{r-1},\ 1 \}. \]

Alternatively, it is often convenient to consider the Poisson multi-hypergraph
version of $\mathcal{H}(H_n)$: here the number of copies of a hyperedge
$i_1 i_2 \cdots i_r$ is simply Poisson with mean
$r!\, h_{i_1 i_2 \cdots i_r}/n^{r-1}$, and these numbers are independent for
different hyperedges.

Turning to the graph, let $G(H_n)$ be the simple graph underlying
$\mathcal{H}(H_n)$, obtained by replacing each $r$-vertex hyperedge by a complete
graph on $r$ vertices, and replacing any multiple edges by single edges. In the
Poisson multi-hypergraph variant, we keep multiple edges.
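
The construction of the random hypergraph and of its underlying simple graph can
be sketched as follows. This is only an illustration under stated assumptions:
the hypermatrix is encoded as a dictionary `hyper` mapping $r$ to a symmetric
function of $r$ indices (a hypothetical encoding), the non-Poisson version with
probabilities $\min\{r!\,h/n^{r-1}, 1\}$ is sampled, and all function names are
ours.

```python
import random
from itertools import combinations
from math import factorial


def sample_hypergraph(n, hyper, rng):
    """Each r-set E is present independently with probability
    min(r! * h_E / n^(r-1), 1); `hyper` maps r to a symmetric function h."""
    edges = []
    for r, h in hyper.items():
        for e in combinations(range(n), r):
            p = min(factorial(r) * h(*e) / n ** (r - 1), 1.0)
            if rng.random() < p:
                edges.append(e)
    return edges


def underlying_graph(edges):
    """Replace each hyperedge by a clique; the set merges multiple edges."""
    return {pair for e in edges for pair in combinations(e, 2)}


def component_sizes(n, graph_edges):
    """Component orders of a graph on range(n), via a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in graph_edges:
        parent[find(u)] = find(v)
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


rng = random.Random(0)
# Constant entries, chosen only for illustration.
hyper = {2: lambda i, j: 0.5, 3: lambda i, j, k: 0.3}
edges = sample_hypergraph(60, hyper, rng)
sizes = component_sizes(60, underlying_graph(edges))
```

The list `sizes` then holds the component orders of one sample of the graph, with
the largest first.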

Remark 3.1. We call an entry $h_{i_1 i_2 \cdots i_r}$ of some $H_{n,r}$ diagonal
if $i_k = i_\ell$ for some $k \ne \ell$. Note that in the definitions of
$\mathcal{H}(H_n)$ and $G(H_n)$, such entries play no role. We shall see later
that, as in the graph case, convergence of $(H_n)$ to $\underline{\kappa}$ in
$\delta_\square$ is unaffected by setting all diagonal entries to 0, so (once we
have shown this) we may assume without loss of generality that all diagonal
entries are 0. However, we do not impose this as a condition of our results,
since there is no need to do so.

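Given a hyperkernel $\underline{\kappa}$, let $\mathfrak{X}_{\underline{\kappa}}$
be the compound Poisson Galton-Watson branching process associated to
$\underline{\kappa}$; for the formal definition see [5]. We write
$\rho(\underline{\kappa})$ for the survival probability of
$\mathfrak{X}_{\underline{\kappa}}$.

As in [5], let $\kappa_e$ be the edge kernel corresponding to
$\underline{\kappa} = (\kappa_r)$, defined by

\[ \kappa_e(x,y) = \sum_{r\ge 2} r(r-1) \int_{S^{r-2}}
   \kappa_r(x,y,x_3,\dots,x_r)\, d\mu(x_3)\cdots d\mu(x_r). \tag{57} \]

Note that $\kappa_e$ may be viewed as a (rescaled) 2-dimensional marginal of the
hyperkernel $\underline{\kappa}$. As in [5], a hyperkernel $\underline{\kappa}$ is
irreducible if the corresponding edge kernel is irreducible. The natural
extension of Theorem 1.1 to hyperkernels is as follows.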

Theorem 3.2. Let $\underline{\kappa}$ be an irreducible, integrable hyperkernel
and $(H_n)$ a sequence of hypermatrices such that
$\delta_\square(H_n, \underline{\kappa}) \to 0$. Then
$C_1(G(H_n))/n \overset{p}{\to} \rho(\underline{\kappa})$ and
$C_2(G(H_n)) = o_p(n)$.

Arguing as in the proof of Lemma 1.7 one can show that Theorem 3.2
extends the corresponding result of [5].

In Theorem 3.2 we define $\delta_\square$ using $\|\cdot\|_{\square,1}$ for the cut
norm. Since $\|\cdot\|_{\square,1} \le \|\cdot\|_{\square,2}$, the corresponding
result for the more natural definition using $\|\cdot\|_{\square,2}$ follows
immediately.

The heart of the proof of Theorem 3.2 will be Lemma 3.3 below, showing
that under an additional assumption, the number of vertices in components of
each fixed size is 'what it should be'. Later we shall first remove the
additional assumption, and then pass from 'large' components to a single giant
component.

We say that a hyperkernel $\underline{\kappa} = (\kappa_r)$ is $R$-bounded if
$\kappa_r$ is zero for $r > R$, in which case we shall often speak of the
hyperkernel $\underline{\kappa} = (\kappa_r)_{r=2}^R$. Correspondingly, a
hypermatrix $H_n = (H_{n,r})_{r\ge 2}$ is $R$-bounded if $H_{n,r}$ is the zero
matrix for $r > R$.

As in [5], we write $\rho_k(\underline{\kappa})$ for the probability that the
branching process $\mathfrak{X}_{\underline{\kappa}}$ consists of $k$ particles in
total. Recall that $N_k(G)$ denotes the number of vertices of a graph $G$ in
components of order $k$.

Lemma 3.3. Let $R \ge 2$ be fixed. Suppose that $\underline{\kappa}$ is an
$R$-bounded hyperkernel and $(H_n)$ is a sequence of $R$-bounded hypermatrices
such that $\delta_\square(H_n, \underline{\kappa}) \to 0$. Then for each
$k \ge 1$ we have $N_k(G(H_n))/n \overset{p}{\to} \rho_k(\underline{\kappa})$.

The proof of this lemma will take up the next several subsections. The
deduction of Theorem 3.2 will then be relatively easy.

3.1 Eliminating large edge probabilities 

Given a hypermatrix $H_n$, for $r \ge 2$ let $A_{n,r}$ be the matrix with entries

\[ a^{(r)}_{ij} = n^{-(r-2)} \sum_{i_3} \sum_{i_4} \cdots \sum_{i_r}
   h_{i j i_3 \cdots i_r}, \tag{58} \]

and let

\[ A_n = \sum_{r\ge 2} r(r-1) A_{n,r} \tag{59} \]

be the marginal matrix corresponding to $H_n$, with entries $a_{ij}$. Note that
the kernel $\kappa_{A_n}$ defined from $A_n$ is simply the edge kernel $\kappa_e$
corresponding to $\underline{\kappa}(H_n)$. Also, in the Poisson multi-graph form
of our model, if all diagonal entries are zero, then the expected number of $ij$
edges in $G(H_n)$ is exactly $a_{ij}/n$. (See Remark 3.1.)
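
The claim about the expected number of $ij$ edges can be checked numerically on a
small example. The sketch below is illustrative only: the entry function `h` is
a made-up symmetric example with zero diagonal, the normalization of the
marginal follows our reading of (58) and (59), and all names are hypothetical.

```python
from itertools import combinations, product
from math import factorial, isclose


def h(*idx):
    """Hypothetical symmetric entry h_{i_1...i_r}; zero on the diagonal."""
    if len(set(idx)) < len(idx):
        return 0.0
    return 1.0 + sum(idx) / 10.0


def marginal_entry(n, r, i, j):
    """a^{(r)}_{ij} = n^{-(r-2)} * sum over i_3, ..., i_r of h_{i j i_3 ... i_r}."""
    total = sum(h(i, j, *rest) for rest in product(range(n), repeat=r - 2))
    return total / n ** (r - 2)


def expected_ij_edges(n, r, i, j):
    """Expected number of r-hyperedges containing both i and j in the Poisson
    model, where the r-set E has mean r! * h_E / n^(r-1)."""
    others = [v for v in range(n) if v not in (i, j)]
    return sum(
        factorial(r) * h(i, j, *rest) / n ** (r - 1)
        for rest in combinations(others, r - 2)
    )


n, R = 6, 4
i, j = 1, 4
# Entry a_ij of the marginal matrix A_n = sum_r r(r-1) A_{n,r}.
a_ij = sum(r * (r - 1) * marginal_entry(n, r, i, j) for r in range(2, R + 1))
# Each hyperedge containing i and j contributes one ij edge to the multigraph.
mean_edges = sum(expected_ij_edges(n, r, i, j) for r in range(2, R + 1))
```

With zero diagonal entries, `mean_edges` agrees with `a_ij / n` up to rounding,
which is the identity stated above.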

Given $W_r \in L^1(S^r)$, let $\widehat{W}_r$ be its marginal with respect to the
first two coordinates, defined by

\[ \widehat{W}_r(x,y) = \int_{S^{r-2}} W_r(x,y,x_3,\dots,x_r)\,
   d\mu(x_3)\cdots d\mu(x_r). \]

Note that

\[ \|\widehat{W}_r\|_\square \le \|W_r\|_\square. \tag{60} \]

Indeed, to see this simply take $S_3 = \cdots = S_r = S$ in (54), or
$f_3 = \cdots = f_r = 1$ in (55). An immediate consequence is the following
lemma.

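Lemma 3.4. Let $R \ge 2$ be fixed, and suppose that $(H_n)$ is a sequence of
$R$-bounded hypermatrices and $\underline{\kappa}$ an $R$-bounded hyperkernel with
$\delta_\square(H_n, \underline{\kappa}) \to 0$. Then
$\delta_\square(A_n, \kappa_e) \to 0$, where $A_n$ is the marginal matrix of
$H_n$, and $\kappa_e$ is the edge kernel of $\underline{\kappa}$.

Proof. By definition of $\delta_\square$, there are measure-preserving bijections
$\tau_n : S \to S$ such that
$\|\underline{\kappa}(H_n) - \underline{\kappa}^{(\tau_n)}\|_\square \to 0$. With
$\underline{\kappa} = (\kappa_r)_{r=2}^R$, writing $\kappa'_r$ for the $r$-kernel
corresponding to $H_{n,r}$, this says exactly that
$\sum_{r=2}^R r \|\kappa'_r - \kappa_r^{(\tau_n)}\|_\square \to 0$. Using (60), and
noting that taking marginals commutes with rearrangement, it follows that
$\sum_{r=2}^R r \|\kappa_{A_{n,r}} - \widehat{\kappa}_r^{(\tau_n)}\|_\square \to 0$.
Since $\|\cdot\|_\square$ is a norm on $L^1(S^2)$, we have

\[ \|\kappa_{A_n} - \kappa_e^{(\tau_n)}\|_\square \le
   \sum_{r=2}^R r(r-1)\, \|\kappa_{A_{n,r}} - \widehat{\kappa}_r^{(\tau_n)}\|_\square
   \to 0, \]

since changing the factor $r$ to $r(r-1)$ does not affect convergence to zero.
Hence $\delta_\square(A_n, \kappa_e) \to 0$. □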

Remark 3.5. To obtain a result analogous to Lemma 3.4 without the
$R$-boundedness assumption, we would have to redefine $\delta_\square$ for
hyperkernels, replacing the factor $r$ in (56) by a factor $r(r-1)$, and only
considering 'edge-integrable' limits $\underline{\kappa}$, i.e., hyperkernels
with $\sum_{r\ge 2} r(r-1) \int \kappa_r$ finite.

Let us call a sequence $(H_n)$ of hypermatrices well behaved if two conditions
hold: every diagonal entry is zero, and $\max A_n / n \to 0$ as $n \to \infty$,
where $\max A_n$ is the largest entry of the $n$-by-$n$ marginal matrix $A_n$
corresponding to $H_n$. Note that if $(H_n)$ is well behaved, then the
probability that some particular edge $ij$ is present in $G(H_n)$ is $o(1)$ as
$n \to \infty$, where the bound is uniform over edges.

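Lemma 3.6. Let $R \ge 2$ be fixed, and suppose that $(H_n)$ is a sequence of
$R$-bounded hypermatrices and $\underline{\kappa}$ is an $R$-bounded hyperkernel
with $\delta_\square(H_n, \underline{\kappa}) \to 0$. Then there is a sequence of
well-behaved $R$-bounded hypermatrices $(H'_n)$ such that
$\|\underline{\kappa}(H_n) - \underline{\kappa}(H'_n)\|_{L^1} \to 0$ and
$\delta_\square(H'_n, \underline{\kappa}) \to 0$.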

Proof. Let $A_n$ be the marginal matrix corresponding to $H_n$ and let $\kappa_e$
be the edge kernel corresponding to $\underline{\kappa}$. Then by Lemma 3.4 we
have $\delta_\square(A_n, \kappa_e) \to 0$. By Lemma 2.1 there is a function
$M(n)$ with $M(n) = o(n)$ such that only $o(n)$ entries of $A_n$ exceed $M(n)$,
and the sum of these entries is $o(n^2)$. This immediately implies that the sum
of any $n$ entries of $A_n$ is $o(n^2)$.

Call an entry $a_{ij}$ of $A_n$ bad if either $a_{ij} > M(n)$ or $i = j$. Let $S$
be the sum of the bad entries, so $S = o(n^2)$. To define $H'_n$, simply modify
$H_n$ by setting to 0 any entry $h_{i_1 i_2 \cdots i_r}$ of $H_{n,r}$ such that
$a_{i_k i_\ell}$ is bad for some pair $i_k, i_\ell$, $k < \ell$. (In other words,
we replace all entries contributing to bad entries $a_{ij}$ in the marginal by
zero.) Then $H'_n$ is a hypermatrix, and its marginal $A'_n = (a'_{ij})$
satisfies $a'_{ij} \le a_{ij}$, with $a'_{ij} = 0$ whenever $a_{ij}$ is bad. Thus
$(H'_n)$ is well behaved.

Finally, for each $r$, we may think of modifying $H_{n,r}$ to obtain $H'_{n,r}$
in $\binom{r}{2}$ stages, in each one fixing $k$ and $\ell$ and setting to zero
entries $h_{i_1 i_2 \cdots i_r}$ for which $a_{i_k i_\ell}$ is bad. The sum of the
entries set to zero at each stage is at most $n^{r-2} S$. It follows easily that

\[ \|\underline{\kappa}(H_n) - \underline{\kappa}(H'_n)\|_{L^1} \le
   \sum_{r=2}^R r \binom{r}{2} \frac{n^{r-2} S}{n^r} = O(S/n^2) = o(1). \]

The final statement follows immediately, since

\[ \delta_\square(H_n, H'_n) =
   \delta_\square(\underline{\kappa}(H_n), \underline{\kappa}(H'_n)) \le
   \|\underline{\kappa}(H_n) - \underline{\kappa}(H'_n)\|_\square \le
   \|\underline{\kappa}(H_n) - \underline{\kappa}(H'_n)\|_{L^1}. \]

□

An immediate consequence of Lemma 3.6 is the following rather informally
worded corollary.

Corollary 3.7. In proving Lemma 3.3 we may assume that $(H_n)$ is well behaved.

Proof. Let $(H_n)$ and $\underline{\kappa}$ satisfy the assumptions of Lemma 3.3,
and define $(H'_n)$ as in Lemma 3.6. Let $G'_n = G(H'_n)$ and $G_n = G(H_n)$.
There is a natural coupling of $\mathcal{H}(H_n)$ and $\mathcal{H}(H'_n)$ in which
the expected number of $r$-vertex hyperedges in the symmetric difference is at
most $n \|\kappa_r(H'_n) - \kappa_r(H_n)\|_{L^1}$, where $\kappa_r(\cdot)$ denotes
the $r$-kernel of the corresponding hypermatrix (with equality if all diagonal
entries are zero, at least in the Poisson multi-hypergraph version); by
Lemma 3.6 this number is $o(n)$. Since each hyperedge has at most $R$ vertices,
and so contributes at most $\binom{R}{2} = O(1)$ edges, summing over
$2 \le r \le R$ we have $\mathbb{E}|E(G'_n) \,\triangle\, E(G_n)| = o(n)$.

Now $\delta_\square(H'_n, \underline{\kappa}) \to 0$, so if Lemma 3.3 holds in the
well-behaved case, then
$N_k(G'_n)/n \overset{p}{\to} \rho_k(\underline{\kappa})$. Since adding an edge to
or deleting an edge from a graph $G$ changes the number of vertices in
components of order $k$ by at most $2k$, we have
$\mathbb{E}|N_k(G_n) - N_k(G'_n)| = o(n)$, so
$N_k(G_n)/n \overset{p}{\to} \rho_k(\underline{\kappa})$ follows. □
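
The stability fact used at the end of this proof (one edge changes $N_k$ by at
most $2k$) is easy to test empirically. The following sketch is only an
illustration; the helper names are hypothetical and the random graphs are
arbitrary test data, not the model studied here.

```python
import random
from collections import Counter


def vertices_in_components_of_order(n, edges, k):
    """N_k(G): number of vertices of G lying in components with exactly k vertices."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = Counter(find(v) for v in range(n))
    return sum(s for s in sizes.values() if s == k)


def max_change_after_one_edge(n, num_edges, k, trials, seed=0):
    """Largest observed |N_k(G + e) - N_k(G)| over random graphs G and edges e."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        edges = [tuple(rng.sample(range(n), 2)) for _ in range(num_edges)]
        extra = tuple(rng.sample(range(n), 2))
        before = vertices_in_components_of_order(n, edges, k)
        after = vertices_in_components_of_order(n, edges + [extra], k)
        worst = max(worst, abs(after - before))
    return worst


# Two disjoint paths on 3 vertices: joining them destroys two components of order 3.
two_paths = [(0, 1), (1, 2), (3, 4), (4, 5)]
n3_before = vertices_in_components_of_order(6, two_paths, 3)
n3_after = vertices_in_components_of_order(6, two_paths + [(2, 3)], 3)
```

The two-path example shows that the bound $2k$ is attained: joining two
components of order $k$ removes $2k$ vertices from the count.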

3.2 Hypertree integrals 

Throughout this subsection, we fix an integer $R \ge 2$. All hyperkernels will be
$R$-bounded, and all edges of all hypergraphs will have size at most $R$.

A hypertree is simply a connected hypergraph containing no cycles, or,
equivalently, a connected hypergraph $\mathcal{H}$ in which
$|\mathcal{H}| = 1 + \sum_i (|E_i| - 1)$, where the sum runs over all edges $E_i$
of $\mathcal{H}$.

Given a hyperkernel $\underline{\kappa} = (\kappa_r)_{r\ge 2}$ and a hypertree
$\mathcal{H}$, we shall define $t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa})$
in analogy with (24). Unfortunately, there is a difference in the normalization,
and the marginals need some further explanation. For the latter, given
$W_r \in L^1(S^r)$ let

\[ \lambda_{W_r}(x) = \lambda^{(1)}_{W_r}(x) = \int_{S^{r-1}}
   W_r(x,x_2,\dots,x_r)\, d\mu(x_2)\cdots d\mu(x_r). \]

The marginal $\lambda^{(i)}_{W_r}$ of $W_r$ with respect to the $i$th coordinate
is defined similarly. Given $\underline{\kappa} = (\kappa_r)_{r=2}^R$, set

\[ \lambda(x) = \lambda_{\underline{\kappa}}(x) = \sum_r r\, \lambda_{\kappa_r}(x).
   \tag{61} \]

The reason for the extra factor $r$ is that, as noted earlier, we essentially add
a hyperedge on each ordered $r$-tuple $v_1,\dots,v_r$ with probability
$\kappa_r/n^{r-1}$, and because a particular vertex could appear in $r$ places in
the ordered $r$-tuple, it is then $\lambda(x)$ that gives the expected number of
hyperedges containing a given vertex.

We now define $t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa})$ as an integral
over $S^{|\mathcal{H}|}$ with one variable $x_i$ for each vertex $i$ of
$\mathcal{H}$. The integrand has a factor
$r!\, \kappa_r(x_{i_1},\dots,x_{i_r})$ for each $r$-element hyperedge
$E = i_1 i_2 \cdots i_r$ of $\mathcal{H}$, and a factor $e^{-\lambda(x_i)}$ for
each vertex $i$.

With this definition, Theorem 2.3 extends to the hyperkernel context.

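Theorem 3.8. Let $R \ge 2$ be fixed, and let $\mathcal{H}$ be a hypertree in which
each hyperedge has at most $R$ elements. Then
$\underline{\kappa} \mapsto t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa})$ is
a bounded map on the space $\mathcal{W}_{\mathrm{sym}}$ of $R$-bounded
hyperkernels and is Lipschitz continuous in the cut norm. In other words, there
exists a constant $C$ (depending on $R$ and $\mathcal{H}$ only) such that
$t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa}) \le C$ for all
$\underline{\kappa} \in \mathcal{W}_{\mathrm{sym}}$, and
$|t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa}) -
t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa}')| \le
C \|\underline{\kappa} - \underline{\kappa}'\|_\square$ for all
$\underline{\kappa}, \underline{\kappa}' \in \mathcal{W}_{\mathrm{sym}}$.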
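
For reference, the verbal definition of $t_{\mathrm{isol}}$ given before the
theorem can be written out as a single formula. This is our reading of that
definition, with an exponential factor $e^{-\lambda(x_i)}$ for each vertex; the
shorthands $r_E = |E|$, $E(\mathcal{H})$ and $V(\mathcal{H})$ are notational
assumptions. The second line specializes to a hypertree consisting of a single
hyperedge on $r$ vertices.

```latex
\[
 t_{\mathrm{isol}}(\mathcal{H},\underline{\kappa})
 = \int_{S^{|\mathcal{H}|}}
   \prod_{E = i_1 \cdots i_{r_E} \in E(\mathcal{H})}
     r_E!\,\kappa_{r_E}(x_{i_1},\dots,x_{i_{r_E}})
   \prod_{i \in V(\mathcal{H})} e^{-\lambda(x_i)}
   \prod_{i \in V(\mathcal{H})} d\mu(x_i),
\]
\[
 t_{\mathrm{isol}}(\text{single } r\text{-edge},\underline{\kappa})
 = r! \int_{S^r} \kappa_r(x_1,\dots,x_r)\,
   e^{-\lambda(x_1)-\cdots-\lambda(x_r)}\, d\mu(x_1)\cdots d\mu(x_r).
\]
```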

Rather than give a formal proof, we shall briefly describe the modifications
needed to the arguments in Subsection 2.2. Note that we may take
$\|\cdot\|_\square = \|\cdot\|_{\square,1}$ or
$\|\cdot\|_\square = \|\cdot\|_{\square,2}$ in Theorem 3.8: on $R$-bounded
hyperkernels, these norms are equivalent. As in Subsection 2.2, in this
subsection we use the norm $\|\cdot\|_{\square,2}$.

Firstly, note that Lemma 2.2 extends immediately: if
$W_r, W'_r \in L^1(S^r)$ then

\[ \|\lambda_{W_r} - \lambda_{W'_r}\|_{L^1} \le \|W_r - W'_r\|_\square. \tag{62} \]

(Perhaps the nicest way to see this is to note that, generalizing (60) in the
natural way, the cut norm of any $d$-dimensional marginal of some
$W \in L^1(S^r)$ is at most $\|W\|_\square$, and that on $L^1(S)$, the $L^1$ norm
and cut norm coincide.)

Fix $\mathcal{H}$. Extending (25), suppose that for each $r$-element hyperedge
$E$ of $\mathcal{H}$ we have a $W_E \in \mathcal{W}_r$, where $\mathcal{W}_r$ is
the set of (not necessarily symmetric) non-negative functions
$W_r \in L^1(S^r)$. Then we may define
$t_0(\mathcal{H}, (W_E)_{E \in E(\mathcal{H})})$ in analogy with (25), again
without the exponential factors in
$t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa})$. To reintroduce these, given
any $W_r \in \mathcal{W}_r$ and $a = (a_1,\dots,a_r)$ with each $a_i > 0$, set

\[ W_r^{a}(x_1,\dots,x_r) = W_r(x_1,\dots,x_r)
   \exp\Bigl( -\sum_{i=1}^r a_i \lambda^{(i)}_{W_r}(x_i) \Bigr), \]

in analogy with (26).

The proof of Lemma 2.4 extends mutatis mutandis to give the following
result.

Lemma 3.9. For every fixed $a$ with each $a_i > 0$, the map $W \mapsto W^{a}$ is
Lipschitz continuous on $\mathcal{W}_r$ in the cut norm; more precisely,

\[ \|W_1^{a} - W_2^{a}\|_\square \le (2^r + r 2^r/e)\, \|W_1 - W_2\|_\square \]

for all $W_1, W_2 \in \mathcal{W}_r$. Also, for every $W \in \mathcal{W}_r$, the
$i$th marginal of $W^{a}$ is bounded by $e^{-1}/a_i$. □

As before, the first $2^r$ can be replaced by 1, but we do not care about the
constant.

There is one minor additional complication not present in the graph case,
which we now describe. Given a hyperkernel
$\underline{\kappa} = (\kappa_r)_{r=2}^R$, for each hyperedge $E$ of
$\mathcal{H}$ with $r$ vertices define $W_E \in \mathcal{W}_r$ by

\[ W_E(x_1,\dots,x_r) = \kappa_r(x_1,\dots,x_r)
   \prod_{i=1}^r \exp\bigl( -\lambda_{\underline{\kappa}}(x_i)/d_i \bigr),
   \tag{63} \]

where $d_i$ is the degree in $\mathcal{H}$ of the $i$th vertex of $E$ (in some
arbitrary ordering). Then we have

\[ t_{\mathrm{isol}}(\mathcal{H}, \underline{\kappa}) =
   t_0(\mathcal{H}, (W_E)_{E \in E(\mathcal{H})}), \tag{64} \]

corresponding to (27). In the graph case we simply had $W_{ij} = \kappa^{a}$ for
a suitable vector $a$, but this no longer holds, since the marginals appearing
in (63) are those of $\underline{\kappa}$, not simply those of the kernel
$\kappa_r$ appropriate for $r$-element hyperedges. The extra complication is
dealt with by Lemma 3.10 below.

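Given $B > 0$, let $\mathcal{W}_{r,B}$ be the set of $W \in \mathcal{W}_r$ with
all marginals bounded by $B$. If $f \in L^\infty(S)$ and $W \in \mathcal{W}_r$,
define $fW$ by

\[ (fW)(x_1,\dots,x_r) = f(x_1)\, W(x_1,\dots,x_r). \]

Suppose that $W \in \mathcal{W}_{r,B}$ and $f_1, f_2 \in L^\infty(S)$. Then

\[ \|(f_1 - f_2)W\|_\square \le \|(f_1 - f_2)W\|_{L^1}
   = \|(f_1 - f_2)\lambda\|_{L^1} \le B \|f_1 - f_2\|_{L^1}, \tag{65} \]

where $\lambda$ is the first marginal of $W$. Now suppose that
$f_1,\dots,f_r, f'_1,\dots,f'_r \in L^\infty(S)$ with
$\|f_i\|_\infty, \|f'_i\|_\infty \le 1$ for each $i$, and that
$W, W' \in \mathcal{W}_{r,B}$. Defining $f_1 \cdots f_r W$ and
$f'_1 \cdots f'_r W'$ in the obvious way, with $f_i$ acting on the $i$th
coordinate, we have

\[ \|(f_1 \cdots f_r W) - (f'_1 \cdots f'_r W')\|_\square \le
   \|W - W'\|_\square + B \sum_{i=1}^r \|f_i - f'_i\|_{L^1}. \tag{66} \]

Indeed, we may write the difference as $(f_1 \cdots f_r)(W - W')$ plus $r$ terms
whose cut norms may be bounded by (65); the cut norm of the first term is at most
$\|W - W'\|_\square$, since multiplying each coordinate by a function bounded by
1 in absolute value cannot increase $\|\cdot\|_{\square,2}$.

With $\mathcal{H}$ fixed, let $B = \Delta(\mathcal{H})/e$.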
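
The decomposition behind (66) can be made explicit by a telescoping sum. The
display below is a sketch of that step in our own notation, with each $f_i$ and
$f'_i$ acting on the $i$th coordinate and $\lambda^{(i)}_{W'}$ denoting the
$i$th marginal of $W'$ (a notational assumption).

```latex
\[
 f_1\cdots f_r W - f'_1\cdots f'_r W'
 = f_1\cdots f_r\,(W - W')
 + \sum_{i=1}^{r} f'_1\cdots f'_{i-1}\,(f_i - f'_i)\,f_{i+1}\cdots f_r\, W' .
\]
\[
 \bigl\| f'_1\cdots f'_{i-1}(f_i - f'_i) f_{i+1}\cdots f_r W' \bigr\|_\square
 \le \int_S |f_i - f'_i|\,\lambda^{(i)}_{W'}\, d\mu
 \le B\,\|f_i - f'_i\|_{L^1}.
\]
```

Here the second line uses $|f_j|, |f'_j| \le 1$ and $W' \ge 0$, exactly as in
(65) but for the $i$th coordinate instead of the first.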

Lemma 3.10. For each hyperedge $E$ of $\mathcal{H}$, the map
$\underline{\kappa} \mapsto W_E$ is Lipschitz continuous with respect to the cut
norm, and $W_E$ belongs to $\mathcal{W}_{r,B}$.

Proof. Let $r$ be the number of vertices in $E$, and let
$\underline{\kappa} = (\kappa_s)_{s=2}^R$. Let $W'_E = \kappa_r^{a}$, where
$a = (r/d_1,\dots,r/d_r)$. Since each $\kappa_s$ is symmetric, all its marginals
are equal; we write $\lambda_s$ for any of these marginals. Then
$W_E = f_1 \cdots f_r W'_E$, where

\[ f_i(x_i) = \exp\bigl( -\lambda_{\underline{\kappa}}(x_i)/d_i
   + r\lambda_r(x_i)/d_i \bigr)
   = \exp\Bigl( -\sum_{s \ne r} s\lambda_s(x_i)/d_i \Bigr). \]

Since all marginals $\lambda_s$ are non-negative, we have $0 \le f_i(x) \le 1$.
Applying Lemma 3.9 to $\kappa_r$ tells us that $W'_E \in \mathcal{W}_{r,B}$, and
that the map $\underline{\kappa} \mapsto W'_E$ is Lipschitz continuous. Summing
(62) over $2 \le s \le R$, $s \ne r$, tells us that each $f_i$ varies
continuously (in $L^1$) with $\underline{\kappa}$, and Lipschitz continuity of
$\underline{\kappa} \mapsto W_E$ then follows from (66). Finally,
$W'_E \in \mathcal{W}_{r,B}$ and $0 \le f_i \le 1$ for each $i$ trivially implies
$W_E \in \mathcal{W}_{r,B}$. □

In the light of (64) and Lemma 3.10, it remains only to prove an analogue
of Lemma 2.7, showing that $t_0(\mathcal{H}, (W_E)_{E \in E(\mathcal{H})})$ is
Lipschitz continuous with respect to the cut norm when we assume that each
$W_E \in \mathcal{W}_{r,B}$. The proofs of Lemma 2.6 and Lemma 2.7 carry over
with trivial modifications, noting that for the latter when we delete a single
hyperedge $E$ with $r$ vertices, our hypertree splits into $r$ hypertrees (some
of which may be trivial).

3.3 Small components 

With the preparation above behind us, the argument of Subsection 2.3 goes
through easily. Let us comment very briefly on the changes. Firstly, it is more
convenient in this subsection to consider hypergraphs throughout.

Given a hypergraph $\mathcal{H}$, we write $N_k(\mathcal{H})$ for the number of
vertices in components of order $k$, $N_k^0(\mathcal{H})$ for the number in tree
components of order $k$, and $N_k^1(\mathcal{H})$ for the number in non-tree
components.

The proof of Lemma 2.10 carries over easily to give the following result.

Lemma 3.11. Let $(H_n)$ be a well-behaved $R$-bounded sequence of hypermatrices,
and $\mathcal{H}_n = \mathcal{H}(H_n)$ the corresponding random (Poisson multi-)
hypergraphs. Then for any fixed $k$ we have
$\mathbb{E} N_k^1(\mathcal{H}_n) = o(n)$.

Proof. As in the graph case, we consider the number $M_{\le k}(\mathcal{H})$ of
components of a hypergraph $\mathcal{H}$ that contain a cycle and have at most
$k$ vertices. Since $N_k^1(\mathcal{H}_n) \le k M_{\le k}(\mathcal{H}_n)$, it
suffices to prove that $\mathbb{E} M_{\le k}(\mathcal{H}_n) = o(n)$.

When adding a hyperedge $E$ to a hypergraph $\mathcal{H}$, the quantity
$M_{\le k}$ can increase only if $E$ creates a cycle, i.e., contains at least two
vertices $i$ and $j$ from some component $C$ of $\mathcal{H}$, and after adding
$E$, the component containing $E$ has order at most $k$. This certainly implies
that $E$ contains a pair $\{i,j\}$ of distinct vertices from some component of
order at most $k$. The rest of the proof follows that of Lemma 2.10, using the
fact that $(H_n)$ being well behaved guarantees that the expected number of
edges of $\mathcal{H}_n$ containing a particular pair $\{i,j\}$ of vertices is
$o(1)$, uniformly in $i$ and $j$. □

The remaining arguments in Subsection 2.3 carry over easily.

Proof of Lemma 3.3. Let $(H_n)$ be a sequence of $R$-bounded hypermatrices
converging in $\delta_\square$ to an $R$-bounded hyperkernel
$\underline{\kappa}$. By Corollary 3.7 we may assume that $(H_n)$ is well
behaved.

Given a hyperedge $E = i_1 \cdots i_r$ with vertices contained in $[n]$, let
$h_E = h_{i_1 \cdots i_r}$ be the corresponding entry of $H_{n,r}$, and
$\mu_E = r!\, h_E/n^{r-1}$ the expected number of copies of $E$ in
$\mathcal{H}_n = \mathcal{H}(H_n)$. Given a connected simple hypergraph
$\mathcal{F}$ on $[k]$ and a sequence $\mathbf{v} = (v_1,\dots,v_k)$ of vertices
of $\mathcal{H}_n$, for each hyperedge $E = i_1 \cdots i_r$ of $\mathcal{F}$ let
$\mathbf{v}(E) = v_{i_1} \cdots v_{i_r}$ be the image of $E$ under the map
$i \mapsto v_i$.

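As before, for a good sequence $\mathbf{v}$, let
$p_{\mathbf{v}}(\mathcal{F}) = p_{\mathbf{v}}(\mathcal{F}, H_n)$ be the
probability that the image of $\mathcal{F}$ under $i \mapsto v_i$ is present in
$\mathcal{H}_n$ and forms a component of $\mathcal{H}_n$. Thus

\[ p_{\mathbf{v}}(\mathcal{F}) = \prod_{E \in E(\mathcal{F})} \mu_{\mathbf{v}(E)}
   \prod_{E \in E_0} \exp(-\mu_E), \]

where $E_0$ is the set of all potential edges of $\mathcal{H}_n$ that share at
least one vertex with $\{v_1,\dots,v_k\}$. For any $\mathbf{v}$, set

\[ p^0_{\mathbf{v}}(\mathcal{F}) = \prod_{E \in E(\mathcal{F})}
   \mu_{\mathbf{v}(E)} \prod_{i=1}^{k} \exp(-\lambda_n(v_i)), \]

where $\lambda_n(v)$ is the sum of the probabilities of all hyperedges meeting
$v$. Note that $\lambda_n$ is exactly the marginal of the hyperkernel
corresponding to $H_n$, but here viewed as a function on $[n]$ rather than on
$[0,1]$.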
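
The identification of $\lambda_n$ with the marginal in (61) can be verified by a
short count. The derivation below is a sketch in our notation, assuming that all
diagonal entries are zero, that "probability" is read as the Poisson mean
$\mu_E = r!\,h_E/n^{r-1}$, and that $x$ lies in the interval of length $1/n$
associated with the vertex $v$; $\kappa_r$ denotes the $r$-kernel of $H_n$.

```latex
\[
 \lambda_n(v)
 = \sum_{r \ge 2}\ \sum_{\substack{E \ni v \\ |E| = r}} \frac{r!\, h_E}{n^{r-1}}
 = \sum_{r \ge 2} \frac{r!}{(r-1)!}\, n^{-(r-1)}
   \sum_{i_2,\dots,i_r} h_{v\, i_2 \cdots i_r}
 = \sum_{r \ge 2} r\, \lambda_{\kappa_r}(x).
\]
```

The middle step uses that each $r$-set containing $v$ corresponds to $(r-1)!$
ordered tuples $(i_2,\dots,i_r)$, and that tuples with repeated indices
contribute nothing when the diagonal entries vanish.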

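If $\mathbf{v}$ is good, the only difference between
$p^0_{\mathbf{v}}(\mathcal{F})$ and $p_{\mathbf{v}}(\mathcal{F})$ is that for
each $E \in E_0$ sharing $s \ge 2$ vertices with $\{v_1,\dots,v_k\}$, the factor
$\exp(-\mu_E)$ appears $s$ times in $p^0_{\mathbf{v}}(\mathcal{F})$ but only once
in $p_{\mathbf{v}}(\mathcal{F})$. Since $(H_n)$ is well behaved, for any
$i \ne j$ the sum of $\mu_E$ over hyperedges $E$ containing both $i$ and $j$ is
$o(1)$, so it follows as before that
$p^0_{\mathbf{v}}(\mathcal{F}) \sim p_{\mathbf{v}}(\mathcal{F})$.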

Let $\mathcal{T}$ be a hypertree. Summing $p^0_{\mathbf{v}}(\mathcal{T})$ over all
sequences $\mathbf{v}$ we obtain exactly
$n\, t_{\mathrm{isol}}(\mathcal{T}, \underline{\kappa}(H_n))$. The rest of the
proof of Lemma 2.8 goes through essentially unchanged to show that the
contribution from bad sequences $\mathbf{v}$ is negligible, and summing over
hypertrees $\mathcal{T}$, and using Lemma 3.11, it follows that
$\mathbb{E} N_k(\mathcal{H}_n)/n \to \rho_k(\underline{\kappa})$. (Note that the
corresponding identity from the graph case holds unchanged for hypergraphs too,
with the normalizations used here.) As before, considering disjoint copies of
two trees gives convergence in probability, as required. □

Finally, we note that the result we have just proved extends from $R$-bounded
hyperkernels to general hyperkernels.

Corollary 3.12. Let $\underline{\kappa}$ be an integrable hyperkernel and $(H_n)$
a sequence of hypermatrices with
$\delta_\square(H_n, \underline{\kappa}) \to 0$, and set $G_n = G(H_n)$. Then
$N_k(G_n)/n \overset{p}{\to} \rho_k(\underline{\kappa})$ for each fixed
$k \ge 1$.

Proof. Firstly, it makes no difference whether we work with the hypergraphs
$\mathcal{H}_n = \mathcal{H}(H_n)$ or the underlying graphs $G_n = G(H_n)$, as
these have exactly the same components.

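Fix $k \ge 1$. Let $\underline{\kappa} = (\kappa_r)_{r\ge 2}$. For $R \ge 2$, set
$\underline{\kappa}^R = (\kappa_r)_{r=2}^R$, and similarly define $H_n^R$ by
omitting all matrices $H_{n,r}$ with $r > R$. Fix $\varepsilon > 0$. Since
$\underline{\kappa}$ is integrable, we have
$\iota(\underline{\kappa}^R) \uparrow \iota(\underline{\kappa})$ as
$R \to \infty$. By Theorem 2.13(i) of [5], we have
$\rho_k(\underline{\kappa}^R) \to \rho_k(\underline{\kappa})$. Hence there is
some $R$ such that
$\iota(\underline{\kappa} - \underline{\kappa}^R) < \varepsilon$ and

\[ |\rho_k(\underline{\kappa}) - \rho_k(\underline{\kappa}^R)| < \varepsilon.
   \tag{67} \]

Fix such an $R$. From the definition of $\delta_\square$, we have

\[ \iota\bigl(\underline{\kappa}(H_n) - \underline{\kappa}(H_n^R)\bigr)
   < \varepsilon + \delta_\square(\underline{\kappa}(H_n), \underline{\kappa})
   = \varepsilon + o(1). \]

Coupling $\mathcal{H}_n$ and $\mathcal{H}_n^R = \mathcal{H}(H_n^R)$ in the natural
way so that the former contains the latter, the expected sum of the sizes of the
extra hyperedges in $\mathcal{H}_n$ is at most
$n\, \iota(\underline{\kappa}(H_n) - \underline{\kappa}(H_n^R)) \le
(\varepsilon + o(1)) n$. Since adding a clique of size $r$ to a graph $G$ changes
the number of vertices in components of size at most $k$ by at most $rk$, it
follows that for $k$ fixed we have
$\mathbb{E}|N_k(\mathcal{H}_n) - N_k(\mathcal{H}_n^R)| \le
k\varepsilon n + o(n)$, so for $n$ large enough,

\[ \mathbb{P}\bigl( |N_k(\mathcal{H}_n) - N_k(\mathcal{H}_n^R)| \ge
   k\sqrt{\varepsilon}\, n \bigr) \le 2\sqrt{\varepsilon}, \tag{68} \]

say. Applying Lemma 3.3 to the sequence $(H_n^R)$, we have
$N_k(\mathcal{H}_n^R) = \rho_k(\underline{\kappa}^R) n + o_p(n)$. Since
$\varepsilon > 0$ was arbitrary, the result follows from this, (67) and (68). □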
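
The passage from the expectation bound to (68) is an application of Markov's
inequality. As a sketch, writing $X_n = |N_k(\mathcal{H}_n) -
N_k(\mathcal{H}_n^R)|$ (our shorthand) and taking $n$ large enough that the
$o(n)$ term is at most $k\varepsilon n$:

```latex
\[
 \mathbb{E} X_n \le k\varepsilon n + o(n) \le 2k\varepsilon n
 \quad\Longrightarrow\quad
 \mathbb{P}\bigl( X_n \ge k\sqrt{\varepsilon}\, n \bigr)
 \le \frac{\mathbb{E} X_n}{k\sqrt{\varepsilon}\, n}
 \le \frac{2k\varepsilon n}{k\sqrt{\varepsilon}\, n}
 = 2\sqrt{\varepsilon}.
\]
```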

3.4 Proof of Theorem 3.2

We have just seen that for each $k$ we have the 'right' number of vertices of
$G(H_n)$ in components of order $k$; it remains only to show, using the
additional assumption of irreducibility, that almost all vertices in large
components in fact form a single giant component.

Proof of Theorem 3.2. As usual, Corollary 3.12 implies that there is some
$\omega = \omega(n) \to \infty$, which we may take to be $o(n)$, such that

\[ N_{\ge \omega}(G(H_n))/n \overset{p}{\to} \rho(\underline{\kappa}). \tag{69} \]

Let $G_n = G(H_n)$. As in the proof of Theorem 1.1, in the light of (69) it
suffices to show that $C_1(G_n) \ge \rho(\underline{\kappa}) n + o_p(n)$. In
doing so we may of course assume that $\rho(\underline{\kappa}) > 0$.

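Fix $\varepsilon > 0$. Theorem 2.12(i) of [5] tells us that as $\gamma \to 0$ we
have $\rho((1-\gamma)\underline{\kappa}) \to \rho(\underline{\kappa})$, so there
is some $\gamma > 0$ with
$\rho((1-\gamma)\underline{\kappa}) > \rho(\underline{\kappa}) - \varepsilon$. In
the Poisson multi-hypergraph form, we may write
$\mathcal{H}_n = \mathcal{H}(H_n)$ as $\mathcal{H}'_n \cup \mathcal{H}''_n$, where
$\mathcal{H}'_n = \mathcal{H}((1-\gamma)H_n)$,
$\mathcal{H}''_n = \mathcal{H}(\gamma H_n)$, and $\mathcal{H}'_n$ and
$\mathcal{H}''_n$ are independent.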

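Writing $G'_n$ for the graph corresponding to $\mathcal{H}'_n$, applying (69) to
$(\mathcal{H}'_n)$ there is some $\omega = \omega(n) \to \infty$ such that

\[ N_{\ge \omega}(G'_n) \ge (\rho((1-\gamma)\underline{\kappa}) - \varepsilon) n
   \ge (\rho(\underline{\kappa}) - 2\varepsilon) n \]

holds whp. We shall attempt to use the hyperedges of $\mathcal{H}''_n$ to join up
the large components of $G'_n$.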

As in [5], the trick is to select one edge from each hyperedge, to obtain a
graph. More precisely, let $G''_n$ be the random multi-graph obtained from
$\mathcal{H}''_n$ by replacing each hyperedge $E$ of order $r$ by one of the
$\binom{r}{2}$ corresponding edges, chosen uniformly at random. From the Poisson
nature of the model, different edges in $G''_n$ are present independently.

Let $B_n = 2\sum_{r\ge 2} A_{n,r}$, where $A_{n,r}$ is the matrix defined by (58).
The edge probabilities in $G''_n$ are given by $\gamma$ times the entries of
$B_n$. (Note that the coefficient of $A_{n,r}$ is smaller here than in (59), by a
factor $1/\binom{r}{2}$, corresponding to choosing one out of $\binom{r}{2}$
edges.)
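
The coefficient 2 in $B_n$ can be checked by a one-line computation. The sketch
below assumes zero diagonal entries and uses the normalization of $G(\cdot)$, in
which a matrix entry $b_{ij}$ corresponds to an expected number $b_{ij}/n$ of
$ij$ edges; it counts the expected number of $ij$ edges of $G''_n$ that arise
from $r$-vertex hyperedges of $\mathcal{H}(\gamma H_n)$.

```latex
\[
 \underbrace{\gamma\, r(r-1)\, \frac{a^{(r)}_{ij}}{n}}_{\text{hyperedges containing } i,j}
 \cdot \underbrace{\binom{r}{2}^{-1}}_{\text{pair } ij \text{ selected}}
 = \frac{2\gamma\, a^{(r)}_{ij}}{n},
 \qquad\text{so}\qquad
 \sum_{r \ge 2} \frac{2\gamma\, a^{(r)}_{ij}}{n} = \frac{\gamma\, b_{ij}}{n}.
\]
```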

Let $\tau$ be the rescaled edge-kernel defined by

\[ \tau(x,y) = \sum_{r\ge 2} 2 \int_{S^{r-2}} \kappa_r(x,y,x_3,x_4,\dots,x_r)\,
   d\mu(x_3)\cdots d\mu(x_r), \]

i.e., by replacing the factor $r(r-1)$ in (57) by a factor 2. Using (60) and
arguing as in the proof of Lemma 3.4, but replacing each appearance of $r(r-1)$
by 2, it is easy to check that $\delta_\square(\kappa_{B_n}, \tau) \to 0$; this
time, since $2 \le r$, there is no need to truncate the sums over $r$.

Now $\underline{\kappa}$ is irreducible by assumption, which means exactly that
$\kappa_e$ is irreducible. Since $\kappa_e$ and $\tau$ are non-zero in the same
places, it follows that $\tau$ is irreducible. Since the graphs $G''_n$ have the
distribution $G(\gamma B_n)$, and $\delta_\square(B_n, \tau) \to 0$, Lemma 2.14
tells us that given any two sets $X$ and $Y$ of $\varepsilon n$ vertices of
$G''_n$, the probability that there is no path in $G''_n$ from $X$ to $Y$ is
exponentially small. As before we may apply this to all partitions of the large
components of $G'_n$ into two sets each containing at least $\varepsilon n$
vertices to deduce that whp we have
$C_1(G_n) \ge (\rho(\underline{\kappa}) - 3\varepsilon) n$, completing the
proof. □

Theorem 3.2 implies a result for branching processes corresponding to
Theorem 1.9; we leave the details to the reader.

Finally, let us note that using the trick of selecting one edge from each
hyperedge above, it is very easy to extend Theorem 1.3 to the graphs $G(H_n)$
considered in Theorem 3.2.